GlyIleAlaTyrAspProLeuMetLeuLysHisGlnCysValCysGly
1 ggaattgcctatgaccccttgatgctgaaacaccagtgcgtttgtggc 
ccttaacggatactggggaactacgactttgtggtcacgcaaacaccg 

AsnSerThrThrHisProGluHisAlaGlyArgIleGlnSerIleTrp
49 aattccaccacccaccctgagcatgctggacgaatacagagtatctgg 
ttaaggtggtgggtgggactcgtacgacctgcttatgtctcatagacc 

SerArgLeuGlnGluThrGlyLeuLeuAsnLysCysGluArgIleGln
97 tcacgactgcaagaaactgggctgctaaataaatgtgagcgaattcaa 
agtgctgacgttctttgacccgacgatttatttacactcgcttaagtt 

GlyArgLysAlaSerLeuGluGluIleGlnLeuValHisSerGluHis 
145 ggtcgaaaagccagcctggaggaaatacagcttgttcattctgaacat 
ccagcttttcggtcggacctcctttatgtcgaacaagtaagacttgta

HisSerLeuLeuTyrGlyThrAsnProLeuAspGlyGlnLysLeuAsp 
193 cactcactgttgtatggcaccaaccccctggacggacagaagctggac 
gtgagtgacaacataccgtggttgggggacctgcctgtcttcgacctg 

ProArgIleLeuLeuGlyAspAspSerGlnLysPhePheSerSerLeu
241 cccaggatactcctaggtgatgactctcaaaagtttttttcctcatta 
gggtcctatgaggatccactactgagagttttcaaaaaaaggagtaat 

ProCysGlyGlyLeuGlyValSerThr
289 ccttgtggtggacttggggtaagtaca 
ggaacaccacctgaaccccattcatgt 



(57) Abstract: The present invention 
relates to newly discovered human 
histone deacetylases (HDACs), also
referred to as histone deacetylase-like 
polypeptides. The polynucleotide 
sequences and encoded polypeptides 
of the novel HDACs are encompassed 
by the invention, as well as vectors 
comprising these polynucleotides and 
host cells comprising these vectors. 
The invention also relates to antibodies 
that bind to the disclosed HDAC 
polypeptides, and methods employing 
these antibodies. Also related are 
methods of screening for modulators, 
such as inhibitors or antagonists, or 
agonists. The invention also relates to 
diagnostic and therapeutic applications 
which employ the disclosed HDAC 
polynucleotides, polypeptides, and 
antibodies, and HDAC modulators. 
Such applications can be used with 
diseases and disorders associated with 
abnormal cell growth or proliferation, 
cell differentiation, and cell survival, 
e.g., neoplastic cell growth, and 
especially breast and prostate cancers 
or tumors.

NOVEL HUMAN HISTONE DEACETYLASES

RELATED APPLICATIONS 

This application is a continuation-in-part of U.S. Application Serial No. 
60/298,296, filed June 14, 2001, which is incorporated by reference in its
entirety. 

FIELD OF THE INVENTION 

The present invention relates to novel members of the histone
deacetylase (HDAC) family, including BMY_HDAL1, BMY_HDAL2,
BMY_HDAL3, BMY_HDACX_v1, BMY_HDACX_v2, and HDAC9c.
Specifically related are nucleic acids encoding the polypeptide sequences,
vectors comprising the nucleic acid sequences, and antibodies that bind to the
encoded polypeptides. In addition, the invention relates to pharmaceutical
compositions and diagnostic reagents comprising one or more of the
disclosed HDAC components. The present invention also relates to methods
of treating a disease or disorder caused by malfunction of an HDAC, e.g., due
to mutation or altered gene expression. The invention further relates to
methods of using a modulator of an HDAC of the present invention to treat or
ameliorate a disease state. Also related are methods for devising antisense
therapies and prophylactic treatments using the HDACs of the invention. In
particular, the disclosed HDAC components and methods may be used to
prevent, diagnose, and treat diseases and disorders associated with abnormal
cell growth or proliferation, cell differentiation, or cell survival, e.g., neoplasias,
cancers, and tumors, such as breast and prostate cancers or tumors, and
neurodegenerative diseases.

BACKGROUND OF THE INVENTION

Chromatin is a dynamic protein-DNA complex which is modulated by
post-translational modifications. These modifications, in turn, regulate cellular
processes such as gene transcription and replication. Key chromatin
modifications include the acetylation and deacetylation of nucleosomal
histone proteins. Acetylation is catalyzed by histone acetylases (HATs),
whereas deacetylation is catalyzed by deacetylases (HDACs or HDAs).
HDACs catalyze the removal of acetyl groups from the N-termini of histone
core proteins to produce more negatively charged chromatin. This results in
chromatin compaction, which shuts down gene transcription. In addition,
inhibition of HDACs results in the accumulation of hyperacetylated histones.
This, in turn, is implicated in a variety of cellular responses, including altered
gene expression, cell differentiation, and cell-cycle arrest (see, generally, S.G.
Gray et al., 2001, Exp. Cell Res. 262(2):75-83, and U.S. Patent Nos.
6,110,697 and 6,068,987 to Dulski et al.).

The HDAC gene family is composed of two distinct classes. Class I
HDACs are related to the yeast transcriptional regulator, RPD3. Class II
HDACs include a subgroup of proteins containing a C-terminal catalytic
domain as well as a separate N-terminal domain with transcriptional
repression activity. Class III HDAC proteins are related to the yeast sir2
protein and require NAD for activity. Class I HDACs are predominantly
nuclear, whereas class II HDACs are transported between the cytoplasm and
nucleus as part of the regulation of cellular proliferation and/or differentiation
(reviewed in S. Khochbin et al., 2001, Curr. Opin. Genet. Dev. 11(2):162-6).

The best characterized substrates for HDACs include histone or
histone-like peptide sequences containing N-terminal lysines. However, non-
histone HDAC substrates have also been identified, including several
transcription factors. Non-histone substrates for HDACs include p53,
androgen receptor, LEF1/TCF4 (B.R. Henderson et al., 2002, J. Biol. Chem.,
published online on May 1, 2002 as Manuscript M110602200), GATA-1, and
estrogen receptor-alpha (reviewed in D.M. Vigushin et al., 2002, Anticancer
Drugs 13(1):1-13). For these substrates, deacetylation has been shown to
regulate DNA/protein interactions or protein stability. Such molecules may
therefore represent therapeutic targets of HDACs. Importantly, the histone
deacetylase function of HDACs represses transcription by removing the acetyl
moieties from amino terminal lysines on histones, thereby resulting in a
compact chromatin structure. In contrast, the non-histone deacetylase
function of HDACs can either repress or activate transcription.

There has been considerable interest in modulating the activity of
HDACs for the treatment of a variety of diseases, particularly cancer. Several
small molecule inhibitors of HDAC have shown anti-proliferative activities on a
number of tumor cell lines and potent anti-tumor activity in pre-clinical tumor
xenograft models, most recently, CBHA (D.C. Coffey et al., 2001, Cancer
Res. 61(9):3591-4), pyroxamide (L.M. Butler et al., 2001, Clin. Cancer Res.
7(4):962-70), and CHAP31 (Y. Komatsu et al., 2001, Cancer Res.
61(11):4459-66). Several inhibitors are presently being evaluated as single
agents and in combination regimens with cytotoxic agents for the treatment of
advanced malignancies (reviewed in P.A. Marks et al., Curr. Opin. Oncol.
2001 Nov;13(6):477-83). Thus, HDAC inhibitors are being developed as anti-
tumor agents, as well as agents useful for gene therapy (McInerney et al.,
2000, Gene Ther. 7(8):653-663).

Small molecule inhibitors of HDAC activity that have undergone
extensive analysis include trichostatin A (TSA), trapoxin, SAHA (V.M. Richon
et al., 2001, Blood Cells Mol. Dis. 27(1):260-4), CHAPs (Y. Komatsu et al.,
2001, Cancer Res. 61(11):4459-66), MS-27-275 (reviewed in M. Yoshida et
al., 2001, Cancer Chemother. Pharmacol. 48 Suppl. 1:S20-6), depsipeptide
(FR901228; FK228; see, e.g., V. Sandor et al., 2002, Clin. Cancer Res.
8(3):718-28), and CI-994 (see, e.g., P.M. LoRusso et al., 1996, New Drugs
14(4):349-56; S. Prakash et al., 2001, Invest. New Drugs 19(1):1-11).

Trichostatin A and trapoxin have been reported to be reversible and
irreversible inhibitors, respectively, of mammalian histone deacetylase
(Yoshida et al., 1995, BioEssays, 17(5):423-430). Trichostatin A has also
been reported to inhibit partially purified yeast histone deacetylase (Sanchez
del Pino et al., 1994, Biochem. J., 303:723-729). Moreover, trichostatin A is
an antifungal antibiotic and has been shown to have anti-trichomonal activity
and cell differentiating activity in murine erythroleukemia cells, as well as the
ability to induce phenotypic reversion in ras-transformed fibroblast cells (see,
e.g., U.S. Pat. No. 4,218,478; and Yoshida et al., 1995, BioEssays, 17(5):423-
430, and references cited therein). Trapoxin A, a cyclic tetrapeptide, induces
morphological reversion of v-sis-transformed NIH/3T3 cells (Yoshida and
Sugita, 1992, Jap. J. Cancer Res., 83(4):324-328).

The therapeutic effects of HDAC inhibition are believed to occur
through the induction of differentiation and/or apoptosis through the up-
regulation of genes such as the cyclin dependent kinase inhibitors, p21 and
p27 (see, e.g., W. Wharton et al., 2000, J. Biol. Chem. 275(43):33981-7; L.
Huang et al., 2000, Mol. Med. 6(10):849-66). Although known HDAC
inhibitors are efficacious as anti-tumor agents, they are also associated with
toxicity (see, e.g., V. Sandor et al., 2002, Clin. Cancer Res. 8(3):718-28).
Such toxicity is believed to be caused by a non-selective mechanism of
targeting multiple HDACs. Despite the potent anti-tumor activity of HDAC
inhibitors, it is still unclear which HDACs are necessary to produce an anti-
proliferative response. Furthermore, little progress has been made in
comparing the HDAC gene expression profiles in tumor versus normal cells.
Differential HDAC expression may underlie the tumor-selective responses of
HDAC inhibition. In addition, a cellular growth advantage may be conferred
by the expression of particular HDACs. Therefore, there is a need for further
insight into the consequences of selective HDAC inhibition, or activation.

SUMMARY OF THE INVENTION 

The present invention provides novel histone deacetylase (HDAC) 
nucleic acid sequences and their encoded polypeptide products, also called 
histone deacetylase like (HDAL) sequences and products herein, as well as
methods and reagents for modulating HDACs. 

It is an aspect of this invention to provide new HDAC nucleic acid or 
protein sequences, or cell lines overexpressing HDAC nucleic acid and/or 
encoded protein, for use in assays to identify small molecules which modulate 
HDAC activity, preferably antagonize HDAC activity.

It is another aspect of the present invention to employ HDAC protein
structural data for the in silico identification of small molecules which modulate
HDAC activity. This structural data could be generated by experimental
techniques (for example, X-Ray crystallography or NMR spectroscopy) or by
computational modeling based on available histone deacetylase structures
(for example, M.S. Finnin et al., 1999, Nature, 401(6749):188-193).

Another aspect of the present invention provides modulators of HDAC
activity, e.g., antagonists or inhibitors, and their use to treat neoplastic cells,
e.g., cancer cells and tumor cells. In one aspect of the invention, breast or
prostate cancers or tumors are treated using the HDAC modulators. The
modulators of the invention can be employed alone or in combination with
standard anti-cancer regimens for neoplastic cell, e.g., tumor and cancer,
treatments.

In addition, the present invention provides diagnostic reagents (i.e., 
biomarkers) for the detection of cancers, tumors, or neoplastic growth. In one 
embodiment, HDAC (e.g., HDAC9c) nucleic acids or anti-HDAC antibodies
are used to detect the presence of specific cancers or tumors, such as breast 
or prostate cancers or tumors. 

It is yet another aspect of the present invention to employ HDAC 
inhibitors in the regulation of the differentiation state of normal cells such as 
hematopoietic stem cells. According to this invention, a method is provided
for the use of modulators of HDAC in ex vivo therapies, particularly as a 
means to modulate the expression of gene therapeutic vectors. 

Yet another aspect of this invention is to provide antisense nucleic 
acids and oligonucleotides for use in the regulation of HDAC and HDAL gene 
transcription or translation.

An additional aspect of this invention pertains to the use of HDAC 
nucleic acid sequences and antibodies directed against the produced protein 
for prognosis or susceptibility for certain disorders (e.g., breast or prostate 
cancer). 

Further aspects, features and advantages of the present invention will
be better appreciated upon a reading of the detailed description of the
invention when considered in connection with the accompanying
figures/drawings.

BRIEF DESCRIPTION OF THE FIGURES 

The file of this patent contains at least one figure executed in color.
Copies of this patent with color figure(s) will be provided by the Patent and
Trademark Office upon request and payment of the necessary fee.

FIG. 1 shows the novel BMY_HDAL1 partial nucleic acid (cDNA)
sequence (SEQ ID NO:1) and the encoded amino acid sequence (SEQ ID
NO:2) of the BMY_HDAL1 polypeptide product. The top line in each group of
FIG. 1 presents the BMY_HDAL1 protein sequence (SEQ ID NO:2) in 3-letter
IUPAC form; the middle line presents the nucleotide sequence of the
BMY_HDAL1 coding strand (i.e., SEQ ID NO:1); and the bottom line presents
the nucleotide sequence of the reverse strand (SEQ ID NO:3).
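
The three-line layout described for FIG. 1 can be reproduced with a short script. The following is a minimal sketch, assuming the standard genetic code and using only the first 48 nucleotides of the coding strand as printed in FIG. 1. The names `translate_three_letter` and `complement` are illustrative and do not appear in the application. Note that the bottom line of each group is the base-by-base complement written in the same left-to-right order, not the reverse complement.

```python
# Standard genetic code, indexed by codon in t, c, a, g order.
# Assumption: the figure uses the standard nuclear genetic code.
BASES = "tcag"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter amino acid codes; "Ter" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate_three_letter(coding_strand):
    """Translate a coding strand, reading frame 1, into 3-letter codes."""
    seq = coding_strand.lower()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return "".join(THREE_LETTER[CODON_TABLE[codon]] for codon in codons)


def complement(strand):
    """Return the base-by-base complement, in the same left-to-right order."""
    return strand.lower().translate(str.maketrans("acgt", "tgca"))


if __name__ == "__main__":
    # First row of the coding strand as printed in FIG. 1.
    first_row = "ggaattgcctatgaccccttgatgctgaaacaccagtgcgtttgtggc"
    print(translate_three_letter(first_row))  # top line
    print(first_row)                          # middle line
    print(complement(first_row))              # bottom line
```

Running the same functions over each 48-nucleotide row is one way to cross-check a printed figure against its sequence listing.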

FIGS. 2A and 2B show the amino acid sequences of the novel histone
deacetylase-like proteins BMY_HDAL1 (SEQ ID NO:2), BMY_HDAL2 (SEQ
ID NO:4) and BMY_HDAL3 (SEQ ID NO:5) aligned with the following known
histone deacetylase proteins: S. cerevisiae HDA1 (SC_HDA1), (SEQ ID
NO:6); human HDAC4 (HDA4), (SEQ ID NO:7); human HDAC5 (HDA5),
(SEQ ID NO:8); human HDAC7 (HDA7), (SEQ ID NO:9) and to a histone
deacetylase-like protein ACUC from Aquifex aeolicus (AQUIFEX_HDAL),
(SEQ ID NO:10), (M.S. Finnin et al., 1999, Nature, 401(6749):188-193).
Residues identical among all proteins are shown in black text on a gray
background. The sequences were aligned using the ClustalW algorithm as
implemented in the VectorNTI sequence analysis package (1998, 5.5 Ed.,
Informax, Inc.) with a gap opening penalty of 10, a gap extension penalty of
0.1 and no end gap penalties.
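
The shading rule used in FIGS. 2A and 2B (residues identical among all aligned proteins) can be illustrated as follows. This is a minimal sketch that assumes the alignment has already been computed and is supplied as equal-length gapped strings. It does not reproduce the ClustalW or VectorNTI computation, and the function name `identical_columns` and the toy sequences are hypothetical.

```python
def identical_columns(aligned_sequences):
    """Return 0-based indices of alignment columns where every sequence
    carries the same residue (gap characters never count as identical)."""
    if not aligned_sequences:
        return []
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("aligned sequences must all have the same length")
    first = aligned_sequences[0]
    return [
        i
        for i in range(length)
        if first[i] != "-" and all(seq[i] == first[i] for seq in aligned_sequences)
    ]


if __name__ == "__main__":
    # Toy alignment for illustration only; not taken from the figures.
    toy_alignment = ["MK-LV", "MKALV", "MR-LV"]
    print(identical_columns(toy_alignment))
```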

FIGS. 3A and 3B show a GenewiseDB comparison of BMY_HDAL1
amino acid sequence (SEQ ID NO:2) and human HDAC5 (HDA5) amino acid
sequence (SEQ ID NO:8). Genewise results from HDA5_HUMAN_run2
applied to AC002088 nucleic acid (coding) sequence (SEQ ID NO:11).

FIG. 4 presents the results of sequence motif analysis of motifs within
the BMY_HDAL1 amino acid sequence.

FIG. 5 shows the novel BMY_HDAL2 partial nucleic acid (cDNA)
sequence (SEQ ID NO:12) and the encoded amino acid sequence (SEQ ID
NO:4) of the BMY_HDAL2 polypeptide product. The top line in each group of
FIG. 5 presents the BMY_HDAL2 protein sequence (SEQ ID NO:4) in 3-letter
IUPAC form; the middle line presents the nucleotide sequence of the
BMY_HDAL2 coding strand (i.e., SEQ ID NO:12); and the bottom line
presents the nucleotide sequence of the reverse strand (SEQ ID NO:13).

FIG. 6 presents a GenewiseDB comparison of the BMY_HDAL2 amino
acid sequence (SEQ ID NO:4) and human HDAC5 (HDA5) amino acid
sequence (SEQ ID NO:8). Genewise results from HDA5_HUMAN_run3
applied to AC002410 nucleic acid sequence (SEQ ID NO:14).

FIG. 7 shows PROSITE motifs identified in the predicted amino acid 
sequence of the novel BMY_HDAL2 (SEQ ID NO:4). MOTIFS are from: 
bmy_hdal2.aa.fasta.

FIGS. 8A and 8B show the sequences of the N- and C-terminal
sequences of BMY_HDAL3 as determined from BAC AC004994 and BAC
AC004744. FIG. 8A presents the most N-terminal region of the BMY_HDAL3
amino acid sequence (SEQ ID NO:15) presented herein as encoded by the
human genomic BAC AC004994 polynucleotide sequence (SEQ ID NO:17).
FIG. 8B presents an additional C-terminal portion of the BMY_HDAL3 amino
acid sequence (SEQ ID NO:16) as encoded by human genomic BAC
AC004744 polynucleotide sequence (SEQ ID NO:18).

FIG. 9 shows partial transcripts identified from the AC004994
polynucleotide sequence (SEQ ID NO:17) and from the AC004744
polynucleotide sequence (SEQ ID NO:18) assembled into a single contig,
which was designated BMY_HDAL3 (SEQ ID NO:19) using the VectorNTI
ContigExpress program (Informax, Inc.).

FIG. 10 presents the BMY_HDAL3 partial nucleic acid sequence (SEQ
ID NO:19) and the encoded amino acid sequence (SEQ ID NO:5) based on
the assembled BMY_HDAL3 sequence described in FIG. 9. The top line in
each group of FIG. 10 presents the BMY_HDAL3 protein sequence (SEQ ID
NO:5) in 3-letter IUPAC form; the middle line presents the nucleotide
sequence of the BMY_HDAL3 coding strand (i.e., SEQ ID NO:19); and the
bottom line presents the nucleotide sequence of the reverse strand (SEQ ID
NO:20).

WO 02/102323    PCT/US02/19560

FIG. 11 presents the results of the GCG Motifs program used to
analyze the BMY_HDAL3 partial predicted amino acid sequence for motifs in
the PROSITE collection (K. Hofmann et al., 1999, Nucleic Acids Res.,
27(1):215-219) with no allowed mismatches.

FIG. 12 shows a multiple sequence alignment of the novel human
HDAC, BMY_HDAL3, amino acid sequence (SEQ ID NO:5) with the amino
acid sequence of AAC78618 (SEQ ID NO:21) and with the amino acid
sequence of AAD15364 (SEQ ID NO:22). AAC78618 is a histone
deacetylase-like protein predicted by genefinding and conceptual translation
of AC004994 and which was entered in Genbank. AAD15364 is a similar
predicted protein derived from AC004744 and entered in Genbank.
AAC78618, AAD15364 and BMY_HDAL3 were aligned using the ClustalW
algorithm as implemented in the VectorNTI sequence analysis package
(1998, 5.5 Ed., Informax, Inc.) with a gap opening penalty of 10, a gap
extension penalty of 0.1 and no end gap penalties. Residues identical among
all proteins are shown in white text on a black background; conserved
residues are shown in black text on a gray background.

FIG. 13 shows a BLASTN alignment of the AA287983 polynucleotide
sequence (SEQ ID NO:23) and BMY_HDAL3 polynucleotide sequence from
SEQ ID NO:19. Genbank accession AA287983 is a human EST sequence
(GI # 1933807; Incyte template 1080282.1) which was identified by BLASTN
searches against the Incyte LifeSeq database using the NCBI Blast algorithm
(S.F. Altschul et al., 1997, Nucl. Acids Res., 25(17):3389-3402) with default
parameters. The AA287983 human EST was isolated from a germinal B-cell
library. No additional ESTs are included in the Incyte template derived from
this cluster (Incyte gene ID 180282).

FIGS. 14A-14H present other histone deacetylase sequences, as
shown in FIGS. 2A and 2B. FIG. 14A: Aquifex ACUC protein amino acid
sequence (SEQ ID NO:10); FIG. 14B: Saccharomyces cerevisiae histone
deacetylase 1 amino acid sequence (SEQ ID NO:6); FIG. 14C: Homo
sapiens histone deacetylase 4 amino acid sequence (SEQ ID NO:7); FIG.
14D: Homo sapiens histone deacetylase 5 amino acid sequence (SEQ ID
NO:8); FIG. 14E: Homo sapiens histone deacetylase 7 amino acid sequence
(SEQ ID NO:9); FIG. 14F: Human EST AA287983 nucleic acid sequence
(SEQ ID NO:23); FIG. 14G: Human predicted protein AAD15364 amino acid
sequence (SEQ ID NO:22); and FIG. 14H: Human predicted protein
AAC78618 amino acid sequence (SEQ ID NO:21).

FIGS. 15A-15C depict the nucleotide and amino acid sequence
information for HDAC9c. The polypeptide sequence (SEQ ID NO:87) is
shown using the standard 3-letter abbreviation for amino acids. The DNA
sequence (SEQ ID NO:88) of the coding strand is also shown. FIGS. 15D-
15F depict an amino acid sequence alignment of HDAC9c. The predicted
amino acid sequence of HDAC9c (SEQ ID NO:87) was aligned to previously
identified HDACs, including HDAC9 (AY032737; SEQ ID NO:89), HDAC9a
(AY032738; SEQ ID NO:90), and HDAC4 (ALF1 32608; SEQ ID NO:91), using
ClustalW (D.G. Higgins et al., 1996, Methods Enzymol. 266:383-402).
Identical amino acids are shown in white text on a black background;
conserved amino acids are shown in black text on a gray background.

FIGS. 16A-16C depict expression levels of HDAC9 in human cancer
cell lines and normal adult tissue. FIG. 16A: Northern blot analysis of HDAC9
expression in normal adult tissue. FIG. 16B: Quantitative PCR mRNA
analysis of HDAC9 expression in human tumor cell lines. FIG. 16C: Nuclease
protection assay analysis of HDAC9 expression in human tumor cell lines.
FIG. 16D shows the nucleotide sequence of HDAC9c used to derive the
probes used for Northern blotting and nuclease protection analysis (SEQ ID
NO:92). The probes were derived from the HDAC9c nucleotide sequence,
and were predicted to hybridize to HDAC9c and HDAC9 (AY032737), but not
HDAC9a (AY032738).

FIGS. 17A-17C illustrate the increase of HDAC9 gene expression in
human cancer tissues. FIGS. 17A-17B: Summary of HDAC9 expression in
selected tissues, as assayed by in situ hybridization. FIG. 17C:
Photomicrographs of representative cells showing HDAC9 or actin staining.

FIG. 18 shows HDAC9c-mediated induction of morphological
transformation of NIH/3T3 cells. The panels show photomicrographs of soft
agar growth of vector (upper panel), FGF8 (middle panel) and HDAC9c (lower
panel) transfected NIH/3T3 cells. Cells are shown at 10X magnification.






FIG. 19 shows HDAC9c induction of actin stress fiber formation in 
NIH/3T3 cells. Stable NIH/3T3 cells expressing the indicated constructs were 
stained with phalloidin-TRITC and visualized by fluorescent microscopy. 

FIGS. 20A-20C depict the nucleotide and amino acid sequence
information for BMY_HDACX variant 1, also called BMY_HDACX_v1 and
HDACX_v1. BMY_HDACX_v1 represents a partial cDNA sequence obtained
from cells expressing a transcript variant of human HDAC9. The polypeptide
sequence (SEQ ID NO:93) is shown using the standard 3-letter abbreviation
for amino acids. The DNA sequence (SEQ ID NO:94) of the coding strand is
also shown.

FIGS. 21A-21B depict the nucleotide and amino acid sequence 
information for BMY_HDACX variant 2, also called BMY_HDACX_v2 and 
HDACX_v2. BMY_HDACX_v2 represents a full-length sequence of a novel 
transcript variant (i.e., splice product) of HDAC9. The polypeptide sequence 
(SEQ ID NO:95) is shown using the standard 3-letter abbreviation for amino
acids. The DNA sequence (SEQ ID NO:96) of the coding strand is also 
shown. 

FIGS. 22A-22I depict the nucleotide and amino acid sequence
information for the previously identified HDAC9 transcript variants. FIGS.
22A-22C: HDAC9 variant 1 (HDAC9v1; NCBI Ref. Seq. NM_058176). The
polypeptide sequence (SEQ ID NO:89) is shown using the standard 3-letter
abbreviation for amino acids. The DNA sequence (SEQ ID NO:97) of the
coding strand is also shown. FIGS. 22D-22F: HDAC9 variant 2 (HDAC9v2;
NCBI Ref. Seq. NM_058177). The polypeptide sequence (SEQ ID NO:90) is
shown using the standard 3-letter abbreviation for amino acids. The DNA
sequence (SEQ ID NO:98) of the coding strand is also shown. FIGS.
22G-22I: HDAC9 variant 3 (HDAC9v3; NCBI Ref. Seq. NM_014707). The
polypeptide sequence (SEQ ID NO:99) is shown using the standard 3-letter
abbreviation for amino acids. The DNA sequence (SEQ ID NO:100) of the
coding strand is also shown.

FIGS. 23A-23K depict a multiple sequence alignment of nucleotide
sequences representing known and novel HDAC9 splice products. The
cDNAs for BMY_HDACX_v1 (SEQ ID NO:94) and BMY_HDACX_v2 (SEQ ID
NO:96) nucleotide sequences were aligned to the three reported splice
products of the HDAC9 gene, including HDAC9v1 (NCBI Ref. Seq.
NM_058176; SEQ ID NO:97), HDAC9v2 (NCBI Ref. Seq. NM_058177; SEQ
ID NO:98), and HDAC9v3 (NCBI Ref. Seq. NM_014707; SEQ ID NO:100)
using the sequence alignment program ClustalW (D.G. Higgins et al., 1996,
Methods Enzymol. 266:383-402). The consensus sequence is shown on the
bottom line (SEQ ID NO:106). Identical nucleotides are shown in white text
on a black background. Selected splice junctions are indicated below the
alignment; these junctions were identified by comparison of the cDNA
sequences to the assembled genomic contig NTJ30798.1 using the Sim4
algorithm (L. Florea et al., 1998, Genome Res. 8:967-74). It is noted that the
HDAC9 (AY032737) nucleotide and amino acid sequences are identical to the
HDAC9v1 (NM_058176) nucleotide and amino acid sequences. Similarly, the
HDAC9a (AY032738) nucleotide and amino acid sequences are identical to
the HDAC9v2 (NM_058177) nucleotide and amino acid sequences.

FIGS. 24A-24D depict a multiple sequence alignment of amino acid
sequences representing known and novel HDAC polypeptides. The amino
acid sequences encoded by transcript variants BMY_HDACX_v1 (SEQ ID
NO:93) and BMY_HDACX_v2 (SEQ ID NO:95) were aligned to amino acid
sequences encoded by known splice variants of human histone deacetylase 9
including HDAC9v1 (NCBI Ref. Seq. NM_058176; SEQ ID NO:89), HDAC9v2
(NCBI Ref. Seq. NM_058177; SEQ ID NO:90), and HDAC9v3 (NCBI Ref.
Seq. NM_014707; SEQ ID NO:99), and to human histone deacetylases 4 and
5 (HDA5, SEQ ID NO:8; HDA4, SEQ ID NO:7) using the multiple sequence
alignment program ClustalW (D.G. Higgins et al., 1996, Methods Enzymol.
266:383-402). The consensus sequence is shown on the bottom line (SEQ ID
NO:107). Residues conserved among all polypeptides are shown in white
text on a black background; residues conserved in a majority of polypeptides
are shown in black text on a gray background.

FIGS. 25A-25C depict a multiple sequence alignment of amino acid
sequences showing novel HDAC polypeptides. The amino acid sequences of
BMY_HDAL1 (SEQ ID NO:2), BMY_HDAL2 (SEQ ID NO:4), BMY_HDAL3
(SEQ ID NO:5), HDAC9c (SEQ ID NO:87), HDACX_v1 (SEQ ID NO:93), and
HDACX_v2 (SEQ ID NO:95) were aligned using the T-Coffee program (C.
Notredame et al., 2000, J. Mol. Biol. 302:205-217; C. Notredame et al., 1998,
Bioinformatics 14:407-422). Identical residues are shown in black text on a
gray background.

DESCRIPTION OF THE INVENTION

The present invention discloses several novel HDAC nucleotide
sequences and encoded products. New members of the histone deacetylase
protein family have been identified as having identity to known HDACs. Three
new HDACs are referred to as BMY_HDAL1, BMY_HDAL2, and BMY_HDAL3
herein, wherein HDAL signifies histone deacetylase like proteins in current
nomenclature. These proteins are most similar to the known human histone
deacetylase, HDAC9. Novel HDAC9 splice variants, termed HDACX_v1 and
HDACX_v2, have also been identified. In addition, HDAC9c, an HDAC9-
related family member, has been newly identified and cloned. The nucleic
acid sequences encoding the novel HDAC polypeptides are provided together
with the description of the means employed to obtain these novel molecules.
Such HDAC products can serve as protein deacetylases, which are useful for
disease treatment and/or diagnosis of diseases and disorders associated with
cell growth or proliferation, cell differentiation, and cell survival, e.g.,
neoplastic cell growth, cancers, and tumors.

As shown herein, HDAC9 expression is elevated in tumor cell lines, as
determined by quantitative PCR analysis. Elevated expression of HDAC9
was also observed in clinical specimens of human tumor tissue compared to
normal tissue, using in situ hybridization (ISH) and an HDAC9-specific
riboprobe. Further, cell biological assessment of HDAC9c revealed that
overexpression of HDAC9c confers a growth advantage to normal fibroblasts.
These results indicate that HDAC9c can be used as a diagnostic marker for
tumor progression and that selective HDAC9c inhibitors can be used to target
specific cancer or tumor types, such as breast and prostate cancers or
tumors.

Definitions

The following definitions are provided to more fully describe the present
invention in its various aspects. The definitions are intended to be useful for
guidance and elucidation, and are not intended to limit the disclosed invention
and its embodiments.

HDAC polypeptides (or proteins) refer to the amino acid sequence of
isolated, and preferably substantially purified, human histone deacetylase
proteins isolated as described herein. HDACs may also be obtained from any
species, preferably mammalian, including mouse, rat, non-human primates,
and more preferably, human; and from a variety of sources, including natural,
synthetic, semi-synthetic, or recombinant. The probes and oligos described
may be used in obtaining HDACs from mammals other than humans. The
present invention more particularly provides six new human HDAC family
members, namely, BMY_HDAL1, BMY_HDAL2, BMY_HDAL3, HDACX_v1,
HDACX_v2, and HDAC9c, their polynucleotide sequences (e.g., SEQ ID
NO:1, SEQ ID NO:12, SEQ ID NO:19, SEQ ID NO:88, SEQ ID NO:94, SEQ
ID NO:96, and sequences complementary thereto), and encoded products
(e.g., SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:87, SEQ ID
NO:93, and SEQ ID NO:95).

An agonist (e.g., activator) refers to a molecule which, when bound to,
or interactive with, an HDAC polypeptide, or a functional fragment thereof,
increases or prolongs the duration of the effect of the HDAC polypeptide.
Agonists may include proteins, nucleic acids, carbohydrates, or any other
molecules that bind to and modulate the effect of an HDAC polypeptide. An
antagonist (e.g., inhibitor, blocker) refers to a molecule which, when bound to,
or interactive with, an HDAC polypeptide, or a functional fragment thereof,
decreases or eliminates the amount or duration of the biological or
immunological activity of the HDAC polypeptide. Antagonists may include
proteins, nucleic acids, carbohydrates, antibodies, or any other molecules that
decrease, reduce or eliminate the effect and/or function of an HDAC
polypeptide.

"Nucleic acid sequence", as used herein, refers to an oligonucleotide,
nucleotide, or polynucleotide (e.g., DNA, cDNA, RNA), and fragments or
portions thereof, and to DNA or RNA of genomic or synthetic origin which may
be single- or double-stranded, and represent the sense (coding) or antisense
(non-coding) strand. By way of nonlimiting example, fragments include
nucleic acid sequences that can be about 10 to 60 contiguous nucleotides in
length, preferably, at least 15-60 contiguous nucleotides in length, and also
preferably include fragments that are at least 70-100 contiguous nucleotides,
or which are at least 1000 contiguous nucleotides or greater in length.
Nucleic acids for use as probes or primers may differ in length as described
herein.
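
The notion of a fragment comprising a stated number of contiguous nucleotides of a reference sequence can be checked mechanically. The following is a minimal sketch under the assumption that exact, ungapped matching on a single strand is intended. The function name `shares_contiguous_run` is hypothetical, and the reference used in the example is only the first 48 nucleotides printed in FIG. 1, not a complete SEQ ID NO.

```python
def shares_contiguous_run(candidate, reference, min_length):
    """Return True if `candidate` contains at least `min_length` contiguous
    nucleotides that occur, in order and without gaps, in `reference`."""
    if min_length <= 0:
        raise ValueError("min_length must be positive")
    candidate = candidate.lower()
    reference = reference.lower()
    # Slide a window of min_length over the reference and look for it in
    # the candidate; an empty range (reference too short) yields False.
    return any(
        reference[i:i + min_length] in candidate
        for i in range(len(reference) - min_length + 1)
    )


if __name__ == "__main__":
    reference = "ggaattgcctatgaccccttgatgctgaaacaccagtgcgtttgtggc"
    # Hypothetical probe: 17 reference nucleotides flanked by unrelated bases.
    probe = "nnn" + reference[10:27] + "nnn"
    print(shares_contiguous_run(probe, reference, 15))
    print(shares_contiguous_run(probe, reference, 18))
```

A fuller check would also test the complementary strand, since the definition above covers both sense and antisense strands.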

In specific embodiments, HDAC polynucleotides of the present 
invention can comprise at least 15, 20, 25, 50, 100, 150, 200, 250, 300, 350, 
400, 450, 500, 600, 700, 800, 900, 1000, 1195, 1200, 1500, 2000, 2160, 
2250, 2500, 2755, or 2900 contiguous nucleotides of SEQ ID NO:1, SEQ ID
NO:12, SEQ ID NO:19, SEQ ID NO:88, SEQ ID NO:94, SEQ ID NO:96, or a
sequence complementary thereto. Additionally, a polynucleotide of the 
invention can comprise a specific region of a HDAC nucleotide sequence, 
e.g., a region encoding the C-terminal sequence of the HDAC polypeptide. 
Such polynucleotides can comprise, for example, nucleotides 3024-4467 of
HDAC9c (SEQ ID NO:88), nucleotides 2156-3650 of HDACX_v1 (SEQ ID
NO:94), nucleotides 1174-3391 of HDACX_v2 (SEQ ID NO:96), or portions or
fragments thereof. 

As specific examples, polynucleotides of the invention may comprise at 
least 183 contiguous nucleotides of SEQ ID NO:88; or at least 17 contiguous
nucleotides of SEQ ID NO:96. As additional examples, the polynucleotides of
the invention may comprise nucleotides 1 to 3207 of SEQ ID NO:88; 
nucleotides 1 to 2340 of SEQ ID NO:94; or nucleotides 307 to 1791 of SEQ ID 
NO:96. Further, the polynucleotides of the invention may comprise 
nucleotides 4 to 3207 of SEQ ID NO:88, wherein said nucleotides encode
amino acids 2 to 1069 of SEQ ID NO:87 lacking the start methionine; or
nucleotides 310 to 1791 of SEQ ID NO:96, wherein said nucleotides encode
amino acids 2 to 495 of SEQ ID NO:95 lacking the start methionine. In
addition, polynucleotides of the invention may comprise nucleotides 3024-3207
of SEQ ID NO:88; or nucleotides 1174-1791 of SEQ ID NO:96.
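
The nucleotide and amino acid ranges recited above can be cross-checked with simple arithmetic: a coding range of N nucleotides corresponds to N/3 residues. The following is a minimal sketch of that check, assuming 1-based inclusive coordinates and assuming the recited ranges exclude any stop codon; the function names are illustrative and are not part of the disclosure.

```python
def codon_count(nt_start: int, nt_end: int) -> int:
    """Number of whole codons in a 1-based, inclusive nucleotide range."""
    length = nt_end - nt_start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("range is empty or not a whole number of codons")
    return length // 3


def residue_count(aa_first: int, aa_last: int) -> int:
    """Number of residues in a 1-based, inclusive amino acid range."""
    return aa_last - aa_first + 1


# (nucleotide range, amino acid range) pairs as recited in the text.
RECITED_RANGES = [
    ((4, 3207), (2, 1069)),   # SEQ ID NO:88 -> SEQ ID NO:87, start Met omitted
    ((310, 1791), (2, 495)),  # SEQ ID NO:96 -> SEQ ID NO:95, start Met omitted
]

for (nt_start, nt_end), (aa_first, aa_last) in RECITED_RANGES:
    # Each nucleotide range should encode exactly the recited residues.
    assert codon_count(nt_start, nt_end) == residue_count(aa_first, aa_last)
```

Under these assumptions both recited pairs are internally consistent: 3204 nucleotides correspond to 1068 residues, and 1482 nucleotides correspond to 494 residues.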

"Amino acid sequence" as used herein refers to an oligopeptide,
peptide, polypeptide, or protein sequence, and fragments or portions thereof, 
and to naturally occurring or synthetic molecules. Amino acid sequence 
fragments are typically from about 4 or 5 to about 35, preferably from about 5
to about 15 or 25 amino acids in length and, optimally, retain the biological
activity or function of an HDAC polypeptide. However, it will be understood
that larger amino acid fragments can be used, depending on the purpose
therefor, e.g., fragments of from about 15 to about 50 or 60 amino acids, or
greater. 

Where "amino acid sequence" is recited herein to refer to an amino 
acid sequence of a naturally occurring protein molecule, "amino acid 
sequence" and like terms, such as "polypeptide" or "protein" are not meant to 
limit the amino acid sequence to the complete, native amino acid sequence
associated with the recited protein molecule. In addition, the terms HDAC 
polypeptide and HDAC protein are frequently used interchangeably herein to 
refer to the encoded product of an HDAC nucleic acid sequence of the 
present invention. 

A variant of an HDAC polypeptide can refer to an amino acid sequence
that is altered by one or more amino acids. The variant may have
"conservative" changes, wherein a substituted amino acid has similar
structural or chemical properties, e.g., replacement of leucine with isoleucine.
More rarely, a variant may have "nonconservative" changes, e.g.,
replacement of a glycine with a tryptophan. Minor variations may also include
amino acid deletions or insertions, or both. Guidance in determining which 
amino acid residues may be substituted, inserted, or deleted without 
abolishing functional biological or immunological activity may be found using 
computer programs well known in the art, for example, DNASTAR software. 

An allele or allelic sequence is an alternative form of an HDAC nucleic
acid sequence. Alleles may result from at least one mutation in the nucleic 
acid sequence and may yield altered mRNAs or polypeptides whose structure 
or function may or may not be altered. Any given gene, whether natural or 
recombinant, may have none, one, or many allelic forms. Common
mutational changes that give rise to alleles are generally ascribed to natural
deletions, additions, or substitutions of nucleotides. Each of these types of
changes may occur alone, or in combination with the others, one or more
times in a given sequence. 

Altered nucleic acid sequences encoding an HDAC polypeptide include 
nucleic acid sequences containing deletions, insertions and/or substitutions of 
different nucleotides resulting in a polynucleotide that encodes the same or a
functionally equivalent HDAC polypeptide. Altered nucleic acid sequences 
may further include polymorphisms of the polynucleotide encoding an HDAC 
polypeptide; such polymorphisms may or may not be readily detectable using 
a particular oligonucleotide probe. The encoded protein may also contain
deletions, insertions, or substitutions of amino acid residues, which produce a
silent change and result in a functionally equivalent HDAC protein of the 
present invention. Deliberate amino acid substitutions may be made on the 
basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, 
and/or the amphipathic nature of the residues, as long as the biological
activity or function of the HDAC protein is retained. For example, negatively
charged amino acids may include aspartic acid and glutamic acid; positively 
charged amino acids may include lysine and arginine; and amino acids with 
uncharged polar head groups having similar hydrophilicity values may include 
leucine, isoleucine, and valine; glycine and alanine; asparagine and
glutamine; serine and threonine; and phenylalanine and tyrosine.
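
The substitution groups listed above can be expressed as a simple lookup. The sketch below is illustrative only: it encodes exactly the groups named in the preceding paragraph using one-letter residue codes, and treats a substitution as "conservative" when both residues fall in the same group. The names `CONSERVATIVE_GROUPS` and `is_conservative` are assumptions made for this example.

```python
# Groups taken from the paragraph above, in one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    {"D", "E"},       # aspartic acid, glutamic acid (negatively charged)
    {"K", "R"},       # lysine, arginine (positively charged)
    {"L", "I", "V"},  # leucine, isoleucine, valine
    {"G", "A"},       # glycine, alanine
    {"N", "Q"},       # asparagine, glutamine
    {"S", "T"},       # serine, threonine
    {"F", "Y"},       # phenylalanine, tyrosine
]


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if swapping `original` for `replacement` stays within
    one of the groups listed above. Identical residues are not counted
    as a substitution at all."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

For instance, leucine to isoleucine is reported as conservative, whereas glycine to tryptophan is not, matching the examples given earlier for conservative and nonconservative changes.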

"Peptide nucleic acid" (PNA) refers to an antisense molecule or anti- 
gene agent which comprises an oligonucleotide ("oligo") linked to a peptide
backbone of amino acid residues, which terminates in lysine. PNA typically 
comprise oligos of at least 5 nucleotides linked to amino acid residues. These
small molecules stop transcript elongation by binding to their complementary
strand of nucleic acid (P.E. Nielsen et al., 1993, Anticancer Drug Des., 8:53-63).
PNA may be pegylated to extend their lifespan in the cell where they
preferentially bind to complementary single stranded DNA and RNA. 

Oligonucleotides or oligomers refer to a nucleic acid sequence,
preferably comprising contiguous nucleotides, typically of at least about 6
nucleotides to about 60 nucleotides, preferably at least about 8 to 10
nucleotides in length, more preferably at least about 12 nucleotides in length,
e.g., about 15 to 35 nucleotides, or about 15 to 25 nucleotides, or about 20 to
35 nucleotides, which can be typically used, for example, as probes or 
primers, in PCR amplification assays, hybridization assays, or in microarrays. 
It will be understood that the term oligonucleotide is substantially equivalent to 
the terms primer, probe, or amplimer, as commonly defined in the art. It will
also be appreciated by those skilled in the pertinent art that a longer 
oligonucleotide probe, or mixtures of probes, e.g., degenerate probes, can be 
used to detect longer, or more complex, nucleic acid sequences, for example, 
genomic DNA. In such cases, the probe may comprise at least 20-200 
nucleotides, preferably, at least 30-100 nucleotides, more preferably, 50-100
nucleotides. 

Amplification refers to the production of additional copies of a nucleic 
acid sequence and is generally carried out using polymerase chain reaction 
(PCR) technologies, which are well known and practiced in the art (See, D.W.
Dieffenbach and G.S. Dveksler, 1995, PCR Primer, a Laboratory Manual,
Cold Spring Harbor Press, Plainview, NY). 

Microarray is an array of distinct polynucleotides or oligonucleotides 
synthesized on a substrate, such as paper, nylon, or other type of membrane; 
filter; chip; glass slide; or any other type of suitable solid support. 

The term antisense refers to nucleotide sequences, and compositions
containing nucleic acid sequences, which are complementary to a specific 
DNA or RNA sequence. The term "antisense strand" is used in reference to a 
nucleic acid strand that is complementary to the "sense" strand. Antisense 
(i.e., complementary) nucleic acid molecules include PNA and may be
produced by any method, including synthesis or transcription. Once
introduced into a cell, the complementary nucleotides combine with natural 
sequences produced by the cell to form duplexes that block either 
transcription or translation. The designation "negative" is sometimes used in 
reference to the antisense strand, and "positive" is sometimes used in
reference to the sense strand.

The term consensus refers to the sequence that reflects the most 
common choice of base or amino acid at each position among a series of 
related DNA, RNA, or protein sequences. Areas of particularly good
agreement often represent conserved functional domains. 
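
As an illustration of the definition of consensus given above, the following minimal sketch derives a consensus from sequences that are assumed to be already aligned and of equal length. The function name `consensus` is hypothetical, and ties are resolved in favor of the symbol seen first in the column, which is an arbitrary choice for this example rather than a convention stated in the text.

```python
from collections import Counter


def consensus(aligned: list[str]) -> str:
    """Most common symbol at each position of equal-length aligned sequences."""
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    # zip(*aligned) yields one tuple per alignment column.
    # most_common(1) returns the top symbol; ties keep first-seen order.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*aligned))
```

Applied to the three aligned sequences `ACGT`, `ACGA`, and `TCGT`, the sketch returns `ACGT`, since that is the majority symbol in each column.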

A deletion refers to a change in either nucleotide or amino acid 
sequence and results in the absence of one or more nucleotides or amino 
acid residues. By contrast, an insertion (also termed "addition") refers to a
change in a nucleotide or amino acid sequence that results in the addition of 
one or more nucleotides or amino acid residues, as compared with the 
naturally occurring molecule. A substitution refers to the replacement of one 
or more nucleotides or amino acids by different nucleotides or amino acids. 

A derivative nucleic acid molecule refers to the chemical modification of
a nucleic acid encoding, or complementary to, an encoded HDAC 
polypeptide. Such modifications include, for example, replacement of 
hydrogen by an alkyl, acyl, or amino group. A nucleic acid derivative encodes 
a polypeptide that retains the essential biological and/or functional
characteristics of the natural molecule. A derivative polypeptide is one that is
modified by glycosylation, pegylation, or any similar process that retains the 
biological and/or functional or immunological activity of the polypeptide from 
which it is derived. 

The term "biologically active", i.e., functional, refers to a protein or
polypeptide or peptide fragment thereof having structural, regulatory, or
biochemical functions of a naturally occurring molecule. Likewise, 
"immunologically active" refers to the capability of the natural, recombinant, or 
synthetic HDAC, or any oligopeptide thereof, to induce a specific immune 
response in appropriate animals or cells, for example, to generate antibodies,
and to bind with specific antibodies.

An HDAC-related protein refers to the HDAC and HDAL proteins or
polypeptides described herein, as well as other human homologs of these
HDAC or HDAL sequences, in addition to orthologs and paralogs (homologs)
of the HDAC or HDAL sequences in other species, ranging from yeast to
other mammals, e.g., homologous histone deacetylase. The term ortholog
refers to genes or proteins that are homologs via speciation, e.g., closely
related and assumed to have common descent based on structural and
functional considerations. Orthologous proteins perform recognizably the
same activity in different species. The term paralog refers to genes or
proteins that are homologs via gene duplication, e.g., duplicated variants of a
gene within a genome. (See, W.M. Fitch, 1970, Syst. Zool., 19:99-113).

It will be appreciated that, under certain circumstances, it may be
advantageous to provide homologs of one of the novel HDAC polypeptides 
which function in a limited capacity as one of either an HDAC agonist (i.e., 
mimetic), or an HDAC antagonist, in order to promote or inhibit only a subset 
of the biological activities of the naturally-occurring form of the protein. Thus,
specific biological effects can be elicited by treatment with a homolog of
limited function, and with fewer side effects, relative to treatment with agonists 
or antagonists which are directed to all of the biological activities of naturally- 
occurring forms of HDAC proteins. 

Homologs (i.e., isoforms or variants) of the novel HDAC polypeptides
can be generated by mutagenesis, such as by discrete point mutation(s), or
by truncation. For example, mutation can yield homologs that retain 
substantially the same, or merely a subset of, the biological activity of the 
HDAC polypeptide from which it was derived. Alternatively, antagonistic 
forms of the protein can be generated which are able to inhibit the function of
the naturally-occurring form of the protein, such as by competitively binding to
an HDAC substrate, or HDAC-associated protein. Non-limiting examples of
such situations include competing with wild-type HDAC in the binding of p53
or a histone. Also, agonistic forms of the protein can be generated which are
constitutively active, or have an altered Kcat or Km for deacylation reactions.
Thus, the HDAC protein and homologs thereof may be either positive or
negative regulators of transcription and/or replication. 

The term hybridization refers to any process by which a strand of 
nucleic acid binds with a complementary strand through base pairing. 

The term "hybridization complex" refers to a complex formed between
two nucleic acid sequences by virtue of the formation of hydrogen bonds
between complementary G and C bases and between complementary A and 
T bases. The hydrogen bonds may be further stabilized by base stacking 
interactions. The two complementary nucleic acid sequences hydrogen bond
in an anti-parallel configuration. A hybridization complex may be formed in
solution (e.g., C0t or R0t analysis), or between one nucleic acid sequence
present in solution and another nucleic acid sequence immobilized on a solid
support (e.g., membranes, filters, chips, pins, or glass slides, or any other
appropriate substrate to which cells or their nucleic acids have been affixed). 

The terms stringency or stringent conditions refer to the conditions for 
hybridization as defined by nucleic acid composition, salt and temperature. 
These conditions are well known in the art and may be altered to identify
and/or detect identical or related polynucleotide sequences in a sample. A
variety of equivalent conditions comprising either low, moderate, or high
stringency depend on factors such as the length and nature of the sequence
(DNA, RNA, base composition), reaction milieu (in solution or immobilized on
a solid substrate), nature of the target nucleic acid (DNA, RNA, base
composition), concentration of salts and the presence or absence of other
reaction components (e.g., formamide, dextran sulfate and/or polyethylene
glycol) and reaction temperature (within a range of from about 5°C below the
melting temperature of the probe to about 20°C to 25°C below the melting
temperature). One or more factors may be varied to generate conditions,
either low or high stringency, that are different from but equivalent to the
aforementioned conditions. 

As will be understood by those of skill in the art, the stringency of 
hybridization may be altered in order to identify or detect identical or related 
polynucleotide sequences. As will be further appreciated by the skilled
practitioner, Tm can be approximated by the formulas as known in the art,
depending on a number of parameters, such as the length of the hybrid or
probe in number of nucleotides, or hybridization buffer ingredients and
conditions (See, for example, T. Maniatis et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY,
1982 and J. Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory, Cold Spring Harbor, NY, 1989; Current Protocols in
Molecular Biology, Eds. F.M. Ausubel et al., Vol. 1, "Preparation and Analysis
of DNA", John Wiley and Sons, Inc., 1994-1995, Suppls. 26, 29, 35 and 42;
pp. 2.10.7-2.10.16; G.M. Wahl and S.L. Berger, 1987, Methods Enzymol.,
152:399-407; and A.R. Kimmel, 1987, Methods Enzymol., 152:507-511).
As a general guide, Tm decreases approximately 1°C-1.5°C with every 1%
decrease in sequence homology. Also, in general, the stability of a hybrid is a
function of sodium ion concentration and temperature. Typically, the
hybridization reaction is initially performed under conditions of low stringency, 
followed by washes of varying, but higher stringency. Reference to 
hybridization stringency, e.g., high, moderate, or low stringency, typically
relates to such washing conditions.
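
The general guide stated above, that Tm decreases approximately 1°C to 1.5°C for every 1% decrease in sequence homology, can be turned into a rough estimate. The sketch below is an illustration under stated assumptions: the Tm of the perfectly matched hybrid is supplied by the caller (it is not computed here), the relationship is treated as linear, and the function name `estimated_tm_range` is hypothetical.

```python
def estimated_tm_range(tm_perfect_c: float, percent_identity: float) -> tuple[float, float]:
    """Return (low, high) Tm estimates in degrees C for an imperfect hybrid.

    Applies the rule of thumb of a 1.0 to 1.5 degree C drop per 1% of
    mismatch, relative to the Tm of the perfectly matched hybrid.
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent_identity must be between 0 and 100")
    mismatch_percent = 100.0 - percent_identity
    low = tm_perfect_c - 1.5 * mismatch_percent   # steeper end of the guide
    high = tm_perfect_c - 1.0 * mismatch_percent  # shallower end of the guide
    return (low, high)
```

As a worked case, a hybrid with an assumed perfect-match Tm of 80°C and 90% identity would be estimated to melt between 65°C and 70°C, which suggests why washes at higher temperature retain only closely related sequences.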

Thus, by way of nonlimiting example, high stringency refers to
conditions that permit hybridization of those nucleic acid sequences that form
stable hybrids in 0.018 M NaCl at about 65°C (i.e., if a hybrid is not stable in
0.018 M NaCl at about 65°C, it will not be stable under high stringency
conditions). High stringency conditions can be provided, for instance, by
hybridization in 50% formamide, 5 X Denhardt's solution, 5 X SSPE (saline
sodium phosphate EDTA) (1 X SSPE buffer comprises 0.15 M NaCl, 10 mM
Na2HPO4, 1 mM EDTA) (or 1 X SSC buffer containing 150 mM NaCl, 15 mM
Na3 citrate · 2 H2O, pH 7.0), 0.2% SDS at about 42°C, followed by washing in
1 X SSPE (or saline sodium citrate, SSC) and 0.1% SDS at a temperature of
at least about 42°C, preferably about 55°C, more preferably about 65°C.

Moderate stringency refers, by way of nonlimiting example, to
conditions that permit hybridization in 50% formamide, 5 X Denhardt's solution,
5 X SSPE (or SSC), 0.2% SDS at 42°C (to about 50°C), followed by washing
in 0.2 X SSPE (or SSC) and 0.2% SDS at a temperature of at least about
42°C, preferably about 55°C, more preferably about 65°C.

Low stringency refers, by way of nonlimiting example, to conditions that
permit hybridization in 10% formamide, 5 X Denhardt's solution, 6 X SSPE (or
SSC), 0.2% SDS at 42°C, followed by washing in 1 X SSPE (or SSC) and
0.2% SDS at a temperature of about 45°C, preferably about 50°C.

For additional stringency conditions, see T. Maniatis et al., Molecular 
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring 
Harbor, NY (1982). It is to be understood that the low, moderate and high
stringency hybridization / washing conditions may be varied using a variety of 
ingredients, buffers and temperatures well known to and practiced by the 
skilled practitioner. 

The terms complementary or complementarity refer to the natural
binding of polynucleotides under permissive salt and temperature conditions
by base-pairing. For example, the sequence "A-G-T" binds to the
complementary sequence "T-C-A". Complementarity between two single-
stranded molecules may be "partial", in which only some of the nucleic acids
bind, or it may be complete when total complementarity exists between single
stranded molecules. The degree of complementarity between nucleic acid 
strands has significant effects on the efficiency and strength of hybridization 
between nucleic acid strands. This is of particular importance in amplification 
reactions, which depend upon binding between nucleic acid strands, as well
as in the design and use of PNA molecules.
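
The base-pairing example above ("A-G-T" pairing with "T-C-A") and the distinction between partial and complete complementarity can be sketched as follows. This is an illustrative sketch for DNA only, with hypothetical function names; `fraction_paired` compares two equal-length strands position by position, as in the written example, and does not model the anti-parallel orientation, which `reverse_complement` handles separately.

```python
# Watson-Crick pairing table for DNA (upper and lower case).
DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return seq.translate(DNA_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complementary strand read 5' to 3' (anti-parallel orientation)."""
    return complement(seq)[::-1]


def fraction_paired(strand_a: str, strand_b: str) -> float:
    """Fraction of positions at which strand_b pairs with strand_a.

    1.0 corresponds to complete complementarity; values between 0 and 1
    correspond to partial complementarity.
    """
    if len(strand_a) != len(strand_b) or not strand_a:
        raise ValueError("strands must be non-empty and of equal length")
    expected = complement(strand_a.upper())
    paired = sum(1 for x, y in zip(expected, strand_b.upper()) if x == y)
    return paired / len(strand_a)
```

Here `complement("AGT")` gives `TCA`, reproducing the example in the text, while `reverse_complement("AGT")` gives `ACT`, the same strand written in the conventional 5' to 3' direction.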

The term homology refers to a degree of complementarity. There may 
be partial sequence homology or complete homology, wherein complete 
homology is equivalent to identity, e.g., 100% identity. A partially 
complementary sequence that at least partially inhibits an identical sequence
from hybridizing to a target nucleic acid is referred to using the functional term
"substantially homologous." The inhibition of hybridization of the completely 
complementary sequence to the target sequence may be examined using a 
hybridization assay (e.g., Southern or Northern blot, solution hybridization and 
the like) under conditions of low stringency. A substantially homologous
sequence or probe will compete for and inhibit the binding (i.e., the
hybridization) of a completely homologous sequence or probe to the target 
sequence under conditions of low stringency. Nonetheless, conditions of low 
stringency do not permit non-specific binding; low stringency conditions 
require that the binding of two sequences to one another be a specific (i.e.,
selective) interaction. The absence of non-specific binding may be tested by
the use of a second target sequence which lacks even a partial degree of
complementarity (e.g., less than about 30% identity). In the absence of
non-specific binding, the probe will not hybridize to the second
non-complementary target sequence.

Those having skill in the art will know how to determine percent identity 
between/among sequences using, for example, algorithms such as those
based on the CLUSTALW computer program (J.D. Thompson et al., 1994,
Nucleic Acids Research, 22(22):4673-4680), or FASTDB (Brutlag et al., 1990,
Comp. App. Biosci., 6:237-245), as known in the art. Although the FASTDB
algorithm typically does not consider internal non-matching deletions or
additions in sequences, i.e., gaps, in its calculation, this can be corrected
manually to avoid an overestimation of the % identity. CLUSTALW, however,
does take sequence gaps into account in its identity calculations. 
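
The effect of gap handling on percent identity, noted above, can be illustrated with a simplified calculation. The sketch below is not the CLUSTALW or FASTDB algorithm; it assumes two sequences that have already been aligned to equal length with "-" marking gaps, and the function name `percent_identity` and its `count_gaps` parameter are hypothetical. Ignoring gapped columns gives the higher figure, which is the overestimation the text refers to.

```python
def percent_identity(aligned_a: str, aligned_b: str, count_gaps: bool = True) -> float:
    """Percent identity between two pre-aligned sequences.

    count_gaps=True  : gapped columns count in the denominator as mismatches.
    count_gaps=False : gapped columns are skipped entirely.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for x, y in zip(aligned_a, aligned_b):
        gapped = x == "-" or y == "-"
        if gapped and not count_gaps:
            continue  # column ignored when gaps are not considered
        columns += 1
        if not gapped and x == y:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / columns
```

For the aligned pair `ACGT-ACGT` and `ACGTTACGA`, seven columns match; the result is 87.5% when the gapped column is ignored and about 77.8% when it is counted.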

Also available to those having skill in this art are the BLAST and
BLAST 2.0 algorithms (Altschul et al., 1997, Nucl. Acids Res., 25:3389-3402
and Altschul et al., 1990, J. Mol. Biol., 215:403-410). The BLASTN program
for nucleic acid sequences uses as defaults a wordlength (W) of 11, an
expectation (E) of 10, M=5, N=4, and a comparison of both strands. For
amino acid sequences, the BLASTP program uses as defaults a wordlength
(W) of 3, and an expectation (E) of 10. The BLOSUM62 scoring matrix
(Henikoff and Henikoff, 1992, Proc. Natl. Acad. Sci. USA, 89:10915) uses
alignments (B) of 50, expectation (E) of 10, M=5, N=4, and a comparison of
both strands.

An HDAC polynucleotide of the present invention may show at least 
27.7%, 35%, 40%, 44.1%, 48.2%, 50%, 55.4%, 58.6%, 59.8%, 60%, 60.2%, 
67.8%, 70%, 80%, 81.5%, 85%, 90%, 91%, 92%, 93%, 94%, 94.2%, 94.4%,
95%, 96%, 97%, 97.2%, 97.5%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%,
99.5%, 99.6%, 99.7%, 99.8%, or 99.9% identity to a sequence provided in 
SEQ ID NO:1, SEQ ID NO:12, SEQ ID NO:19, SEQ ID NO:88, SEQ ID 
NO:94, SEQ ID NO:96, or a sequence complementary thereto. An HDAC 
polypeptide of the present invention may show at least 25%, 35%, 40%, 45%,
48.1%, 55.2%, 55.3%, 60%, 65%, 70%, 72%, 75%, 79%, 80%, 80.6%, 85%,
90%, 91%, 92%, 93%, 94%, 94.2%, 95%, 96%, 97%, 97.2%, 97.5%, 98%,
99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, or 99.9%
identity to a sequence provided in any one of SEQ ID NO:2, SEQ ID NO:4,
SEQ ID NO:5, SEQ ID NO:87, SEQ ID NO:93, or SEQ ID NO:95. 

In a preferred aspect of the invention, a HDAC polynucleotide shows at
least 60.2%, 81.5%, or 94.4% identity to the HDAC9c nucleotide sequence
(SEQ ID NO:88 or a sequence complementary thereto); or at least 27.7%,
48.2%, or 55.4% identity to the HDACX_v2 nucleotide sequence (SEQ ID
NO:96 or a sequence complementary thereto). A HDAC polypeptide of the
invention preferably shows at least 55.2%, 80.6%, or 94.2% identity to the
HDAC9c amino acid sequence (SEQ ID NO:87); at least 55.3% identity to the
HDACX_v2 amino acid sequence (SEQ ID NO:95); at least 72% identity to
the amino acid sequence of BMY_HDAL1 (SEQ ID NO:2); at least 79%
identity to the amino acid sequence of BMY_HDAL2 (SEQ ID NO:4); or at
least 70% identity to the amino acid sequence of BMY_HDAL3 (SEQ ID
NO:5).

A composition comprising a given polynucleotide sequence refers
broadly to any composition containing the given polynucleotide sequence. 
The composition may comprise a dry formulation or an aqueous solution. 
Compositions comprising the polynucleotide sequences (e.g., SEQ ID NO:1, 
SEQ ID NO:12, SEQ ID NO:19, SEQ ID NO:88, SEQ ID NO:94, or SEQ ID
NO:96) encoding the novel HDAC polypeptides of this invention, or fragments
thereof, or complementary sequences thereto, may be employed as 
hybridization probes. The probes may be stored in freeze-dried form and may 
be in association with a stabilizing agent such as a carbohydrate. In 
hybridizations, the probe may be employed in an aqueous solution containing
salts (e.g., NaCl), detergents or surfactants (e.g., SDS) and other components
(e.g., Denhardt's solution, dry milk, salmon sperm DNA, and the like). 

The term "substantially purified" refers to nucleic acid sequences or 
amino acid sequences that are removed from their natural environment, i.e., 
isolated or separated by a variety of means, and are at least 60% free,
preferably 75% to 85% free, and most preferably 90% or greater free from
other components with which they are naturally associated.

The term sample, or biological sample, is meant to be interpreted in its
broadest sense. A biological sample suspected of containing nucleic acid 
encoding an HDAC protein, or fragments thereof, or an HDAC protein itself, 
may comprise a body fluid, an extract from cells or tissue, chromosomes 
isolated from a cell (e.g., a spread of metaphase chromosomes), organelle, or
membrane isolated from a cell, a cell, nucleic acid such as genomic DNA (in 
solution or bound to a solid support such as for Southern analysis), RNA (in 
solution or bound to a solid support such as for Northern analysis), cDNA (in 
solution or bound to a solid support), a tissue, a tissue print and the like. 

Transformation refers to a process by which exogenous DNA enters
and changes a recipient cell. It may occur under natural or artificial conditions 
using various methods well known in the art. Transformation may rely on any 
known method for the insertion of foreign nucleic acid sequences into a 
prokaryotic or eukaryotic host cell. The method is selected based on the type
of host cell being transformed and may include, but is not limited to, viral
infection, electroporation, heat shock, lipofection, and particle bombardment.
Such "transformed" cells include stably transformed cells in which the inserted 
DNA is capable of replication either as an autonomously replicating plasmid or 
as part of the host chromosome. Transformed cells also include those cells
that transiently express the inserted DNA or RNA for limited periods of time.

The term "mimetic" refers to a molecule, the structure of which is 
developed from knowledge of the structure of an HDAC protein, or portions 
thereof, and as such, is able to effect some or all of the actions of HDAC 
proteins. 

The term "portion" with regard to a protein (as in "a portion of a given
protein") refers to fragments or segments, for example, peptides, of that 
protein. The fragments may range in size from four or five amino acid 
residues to the entire amino acid sequence minus one amino acid. Thus, a 
protein "comprising at least a portion of the amino acid sequence of the HDAC
molecules presented herein" can encompass a full-length human HDAC
polypeptide, and fragments thereof.

In specific embodiments, HDAC polypeptides of the invention can
comprise at least 5, 10, 20, 30, 50, 70, 100, 200, 300, 400, 500, 600, 700, 
720, 750, 800, 920, or 950 contiguous amino acid residues of SEQ ID NO:2, 
SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:87, SEQ ID NO:93, or SEQ ID 
NO:95. Additionally, a polypeptide of the invention can comprise a specific
region, e.g., the C-terminal region, of a HDAC amino acid sequence. Such
polypeptides can comprise, for example, amino acids 1009-1069 of HDAC9c
(SEQ ID NO:87), amino acids 720-780 of HDACX_v1 (SEQ ID NO:93), or
portions or fragments thereof. 

The term antibody refers to intact molecules as well as fragments
thereof, such as Fab, F(ab')2, Fv, which are capable of binding an epitopic or
antigenic determinant. Antibodies that bind to the HDAC polypeptides can be 
prepared using intact polypeptides or fragments containing small peptides of 
interest or prepared recombinantly for use as the immunizing antigen. The
polypeptide or oligopeptide used to immunize an animal can be derived from
the translation of RNA or synthesized chemically, and can be conjugated to a
carrier protein, if desired. Commonly used carriers that are chemically
coupled to peptides include bovine serum albumin (BSA), keyhole limpet
hemocyanin (KLH), and thyroglobulin. The coupled peptide is then used to
immunize the animal (e.g., a mouse, a rat, or a rabbit).

The term "humanized" antibody refers to antibody molecules in which 
amino acids have been replaced in the non-antigen binding regions, e.g., the 
complementarity determining regions (CDRs), in order to more closely 
resemble a human antibody, while still retaining the original binding capability,
e.g., as described in U.S. Patent No. 5,585,089 to C.L. Queen et al., which is
a nonlimiting example. Fully humanized antibodies, such as those produced 
transgenically or recombinantly, are also encompassed herein. 

The term "antigenic determinant" refers to that portion of a molecule 
that makes contact with a particular antibody (i.e., an epitope). When a
protein or fragment of a protein is used to immunize a host animal, numerous
regions of the protein may induce the production of antibodies which bind 
specifically to a given region or three-dimensional structure on the protein; 
these regions or structures are referred to as antigenic determinants. An
antigenic determinant may compete with the intact antigen (i.e., the
immunogen used to elicit the immune response) for binding to an antibody.

The terms "specific binding" or "specifically binding" refer to the
interaction between a protein or peptide and a binding molecule, such as an
agonist, an antagonist, or an antibody. The interaction is dependent upon the 
presence of a particular structure (e.g., an antigenic determinant or epitope, or 
a structural determinant) of the protein that is recognized by the binding 
molecule. For example, if an antibody is specific for epitope "A", the presence
of a protein containing epitope A (or free, unlabeled A) in a reaction containing
labeled "A" and the antibody will reduce the amount of labeled A bound to the 
antibody. 

The term "correlates with expression of a polynucleotide" indicates that 
the detection of the presence of ribonucleic acid that is similar to one or more
of the HDAC sequences provided herein by Northern analysis is indicative of
the presence of mRNA encoding an HDAC polypeptide in a sample and 
thereby correlates with expression of the transcript from the polynucleotide 
encoding the protein. 

An alteration in the polynucleotide of an HDAC nucleic acid sequence
comprises any alteration in the sequence of the polynucleotides encoding an
HDAC polypeptide, including deletions, insertions, and point mutations that 
may be detected using hybridization assays. Included within this definition is 
the detection of alterations to the genomic DNA sequence which encodes an 
HDAC polypeptide (e.g., by alterations in the pattern of restriction fragment
length polymorphisms capable of hybridizing to the HDAC nucleic acid
sequences presented herein, (i.e., SEQ ID NO:1, SEQ ID NO:12, SEQ ID 
NO:19, SEQ ID NO:88, SEQ ID NO:94, and/or SEQ ID NO:96), the inability of 
a selected fragment of a given HDAC sequence to hybridize to a sample of 
genomic DNA (e.g., using allele-specific oligonucleotide probes), and 

) improper or unexpected hybridization, such as hybridization to a locus other 
than the normal chromosomal locus for the polynucleotide sequence encoding 
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an HDAC polypeptide (e.g., using fluorescent in situ hybridization (FISH) to 

metaphase chromosome spreads). 

Description of Embodiments of the Present Invention 

In one of its embodiments, the present invention is directed to a novel
HDAC termed BMY_HDAL1, which is encoded by the human BAC clones
AC016186, AC00755 and AC002088. The BMY_HDAL1 nucleic acid (cDNA)
sequence is provided as SEQ ID NO:1; the BMY_HDAL1 amino acid
sequence encoded by the BMY_HDAL1 nucleic acid sequence is presented
as SEQ ID NO:2 (FIG. 1).

BMY_HDAL1 was identified by HMM analysis using PFAM model
PF00850 (Example 1). The PFAM-HMM database is a collection of protein
families and domains and contains multiple protein alignments (A. Bateman et
al., 1999, Nucleic Acids Research, 27:260-262). BMY_HDAL1 is most closely
related to the known human histone deacetylase HDAC5; the two proteins are
71% identical and 77% similar over 105 amino acids, as determined by the
GCG Gap program with a gap weight of 8 and a length weight of 2. The gene
structure and predicted cDNA and protein sequence of BMY_HDAL1 were
determined by comparison to the known human histone deacetylase HDAC5
using the GenewiseDB program to analyze human BAC AC002088 (E. Birney
and R. Durbin, 2000, Genome Res., 10(4):547-548).
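By way of illustration only, percent identity and percent similarity of the
kind reported above can be tallied from a gapped pairwise alignment as
sketched below. This is not the GCG Gap program: the alignment is assumed to
have been produced already, the function name and the residue groupings
(EXAMPLE_GROUPS) are hypothetical, and the two short sequences are made up.

```python
def identity_and_similarity(aligned_a, aligned_b, similar_groups):
    """Return (percent_identity, percent_similarity, compared_columns).

    Columns in which either sequence has a gap ('-') are skipped, so the
    percentages are taken over residues that are actually paired.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # Map each residue to the index of its (hypothetical) similarity group.
    group_of = {res: i for i, grp in enumerate(similar_groups) for res in grp}

    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif a in group_of and group_of[a] == group_of.get(b):
            similar += 1

    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared, compared


# Illustrative groups only; NOT the scoring table used by GCG Gap.
EXAMPLE_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]

pid, psim, n = identity_and_similarity("MKLV-DE", "MRLIADQ", EXAMPLE_GROUPS)
print(f"{pid:.1f}% identical, {psim:.1f}% similar over {n} positions")
```

Because similarity depends on which residues are treated as alike, a
different grouping or scoring matrix will generally give a different
similarity figure for the same alignment, while the identity figure is
unaffected.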

Sequence motifs of BMY_HDAL1 were examined using the GCG
Motifs program to ascertain if there were motifs common to other known
proteins in the PROSITE collection (K. Hofmann et al., 1999, Nucleic Acids
Res., 27(1):215-219) with no allowed mismatches. Motifs programs typically
search for protein motifs by searching protein sequences for
regular-expression patterns described in the PROSITE Dictionary. FIG. 4 shows
PROSITE motifs identified in the partial predicted amino acid sequence of
BMY_HDAL1.
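As a non-limiting sketch of how such a regular-expression motif search can
work, the following translates a PROSITE-style pattern into a Python regular
expression and reports every exact (no-mismatch) hit. It is not the GCG Motifs
program; the function names and the test sequence are hypothetical, and only
the basic PROSITE syntax elements are handled.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression.

    Handles x (any residue), [..] (allowed set), {..} (excluded set),
    (n) / (n,m) repeats and the < and > terminus anchors. Anchors placed
    inside square brackets are not supported by this sketch.
    """
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # {P} -> [^P]
        element = element.replace("(", "{").replace(")", "}")   # x(2,4) -> x{2,4}
        element = element.replace("x", ".")                     # x -> any residue
        element = element.replace("<", "^").replace(">", "$")   # terminus anchors
        parts.append(element)
    return "".join(parts)


def find_motifs(sequence, pattern):
    """Return (1-based start, matched text) for every hit, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# N-glycosylation site pattern written in PROSITE syntax.
N_GLYCOSYLATION = "N-{P}-[ST]-{P}."

# Made-up test sequence, not an HDAC sequence.
hits = find_motifs("MKNGSALNPTQNVTW", N_GLYCOSYLATION)
print(hits)
```

The lookahead wrapper is used so that overlapping occurrences are all
reported; a plain left-to-right scan would skip a hit that begins inside a
previous one.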

In another embodiment, the present invention is directed to the novel
HDAC termed BMY_HDAL2, a novel human histone deacetylase-like protein
encoded by genomic BAC AC002410. The BMY_HDAL2 nucleic acid
sequence (SEQ ID NO:12) and its encoded polypeptide (SEQ ID NO:4) are
presented in FIG. 5. BMY_HDAL2 was identified by hidden Markov model
searches using the PFAM HMM PF00850 to search predicted proteins from
human genomic DNA. BMY_HDAL2 is most closely related to the known
human histone deacetylase HDAC5; the two proteins are 78% identical and
86% similar over 163 amino acids as determined by the GCG Gap program
with a gap weight of 8 and a length weight of 2. The gene structure and
predicted cDNA and protein sequences of BMY_HDAL2 were determined by
comparison to BMY_HDA5 using the GenewiseDB program (E. Birney and R.
Durbin, 2000, Genome Res., 10(4):547-548).

Sequence motifs of BMY_HDAL2 were examined using the GCG
Motifs program to ascertain if there were motifs in the PROSITE collection (K.
Hofmann et al., 1999, Nucleic Acids Res., 27(1):215-219) with no allowed
mismatches. FIG. 7 shows PROSITE motifs identified in the partial predicted
amino acid sequence of BMY_HDAL2.

In addition, the genomic location surrounding BMY_HDAL2 was
investigated. Based on the genomic location of BAC AC002410 as reported
by the NCBI MapViewer, BMY_HDAL2 has been localized to chromosome 7
region q36.

In another embodiment, the present invention further provides a third
HDAC termed BMY_HDAL3. The BMY_HDAL3 nucleic acid sequence (SEQ
ID NO:19) and its encoded polypeptide (SEQ ID NO:5) are presented in FIG.
10. BMY_HDAL3 is encoded by the human genomic BAC clones AC004994
and AC004744. BMY_HDAL3 was identified by HMM analysis using PFAM
model PF00850 to search predicted proteins generated from human genomic
DNA sequences using Genscan. BMY_HDAL3 is most closely related to the
known human histone deacetylase HDAC5; the two proteins are 69% identical
over 1122 amino acids as determined by the GCG Gap program with a gap
weight of 8 and a length weight of 2.

The partial transcripts identified from BAC clones AC004994 (SEQ ID
NO:15) and AC004744 (SEQ ID NO:16) were assembled into a single contig
(designated BMY_HDAL3) using the VectorNTI ContigExpress program
(Informax). (FIG. 9). The gene structure and predicted cDNA and protein
sequence of BMY_HDAL3 were determined by comparison to the known
human histone deacetylase HDAC5 using the GenewiseDB program (K.
Hofmann et al., 1999, Nucleic Acids Res., 27(1):215-219) and are presented
in FIG. 9. The most N-terminal region of the BMY_HDAL3 sequence
described herein is encoded by human genomic BAC AC004994. (FIG. 8A).

BMY_HDAL3 has been localized to chromosome 7, region q36 based 
on the locations reported for AC004994 and by the NCBI Map Viewer. 

Sequence motifs of BMY_HDAL3 were examined using the GCG
Motifs program to ascertain if there were motifs in the PROSITE collection (K.
Hofmann et al., 1999, Nucleic Acids Res., 27(1):215-219) with no allowed
mismatches. FIG. 11 shows PROSITE motifs identified in the partial
predicted amino acid sequence of BMY_HDAL3. FIG. 12 shows a multiple
sequence alignment of the novel human HDAC, BMY_HDAL3, amino acid
sequence (SEQ ID NO:5) with the amino acid sequence of AAC78618 (SEQ
ID NO:21) and with the amino acid sequence of AAD15364 (SEQ ID NO:22).
AAC78618 is a histone deacetylase-like protein predicted by genefinding and
conceptual translation of AC004994 and which was entered in Genbank.
AAD15364 is a similar predicted protein derived from AC004744 and entered
in Genbank. AAC78618, AAD15364 and BMY_HDAL3 were aligned using the
ClustalW algorithm as implemented in the VectorNTI sequence analysis
package (1998, 5.5 Ed., Informax, Inc.) with a gap opening penalty of 10, a
gap extension penalty of 0.1 and no end gap penalties.

Novel HDAC9 variants, termed HDACX_v1 and HDACX_v2, have also
been identified. In addition, HDAC9c, an HDAC9-related family member, has
been newly identified and cloned.

HDAC Polynucleotides and Polypeptides 

The present invention encompasses novel HDAC nucleic acid
sequences (e.g., SEQ ID NO:1, SEQ ID NO:12, SEQ ID NO:19, SEQ ID
NO:88, SEQ ID NO:94, SEQ ID NO:96, and sequences complementary
thereto) encoding newly discovered histone deacetylase-like polypeptides
(e.g., SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:87, SEQ ID
NO:93, and SEQ ID NO:95). These HDAC polynucleotides, polypeptides, or
compositions thereof, can be used in methods for screening for antagonists or
inhibitors of the activity or function of HDACs.

In another of its embodiments, the present invention encompasses new
HDAC polypeptides comprising the amino acid sequences of, e.g., SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:87, SEQ ID NO:93, and SEQ
ID NO:95, and as shown in FIG. 1, FIG. 5, FIG. 10, FIGS. 15A-15C, FIGS.
20A-20C, and FIGS. 21A-21B.

The HDAC polypeptides as described herein show close similarity to
HDAC proteins, including HDAC5 and HDAC9. FIGS. 2A and 2B portray the
structural similarities among the novel HDAC polypeptides and several other
proteins, namely Aquifex HDAL, Human HDAC4, Human HDAC5, Human
HDAC7, and Saccharomyces cerevisiae HDAL. FIGS. 15D-15F show the
amino acid sequence similarity and identity shared by HDAC9c and previously
identified HDAC9 amino acid sequences. FIGS. 23A-23K show the
nucleotide sequence identity shared by HDACX_v1, HDACX_v2, and
previously identified HDAC9 nucleotide sequences.

Variants of the disclosed HDAC polynucleotides and polypeptides are
also encompassed by the present invention. In some cases, an HDAC
polynucleotide variant (i.e., a variant of SEQ ID NO:1, SEQ ID NO:12, SEQ ID
NO:19, SEQ ID NO:88, SEQ ID NO:94, or SEQ ID NO:96) will encode an
amino acid sequence identical to an HDAC sequence (e.g., SEQ ID NO:2, SEQ
ID NO:4, SEQ ID NO:5, SEQ ID NO:87, SEQ ID NO:93, and SEQ ID NO:95).
This is due to the redundancy (degeneracy) of the genetic code, which allows
for silent mutations. In other cases, an HDAC polynucleotide variant will
encode an HDAC polypeptide variant (i.e., a variant of SEQ ID NO:2, SEQ ID
NO:4, SEQ ID NO:5, SEQ ID NO:87, SEQ ID NO:93, or SEQ ID NO:95).
Preferably, an HDAC polypeptide variant has at least 75 to 80%, more
preferably at least 85 to 90%, and even more preferably at least 90% or
greater amino acid sequence identity to one or more of the HDAC amino acid
sequences (e.g., SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:87,
SEQ ID NO:93, and SEQ ID NO:95) as disclosed herein, and retains at
least one biological or other functional characteristic or activity of the HDAC
polypeptide. Most preferred is a variant having at least 95% amino acid
sequence identity to the amino acid sequences set forth in SEQ ID NO:2,
SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:87, SEQ ID NO:93, and SEQ ID
NO:95.

An amino acid sequence variant of the HDAC proteins can be
categorized into one or more of three classes: substitutional, insertional, or
deletional variants. Such variants are typically prepared by site-specific
mutagenesis of nucleotides in the DNA encoding the HDAC protein, using
cassette or PCR mutagenesis, or other techniques that are well known and
practiced in the art, to produce DNA encoding the variant. Thereafter, the
DNA is expressed in recombinant cell culture as described herein. Variant
HDAC protein fragments having up to about 100-150 residues may be
prepared by in vitro synthesis using conventional techniques.

Amino acid sequence variants are characterized by the predetermined
nature of the variation, a feature that sets them apart from naturally occurring
allelic or interspecies variations of an HDAC amino acid sequence. The
variants typically exhibit the same qualitative biological activity as that of the
naturally occurring analogue, although variants can also be selected having
modified characteristics. While the site or region for introducing an amino
acid sequence variation is predetermined, the mutation per se need not be
predetermined. For example, in order to optimize the performance of a
mutation at a given site, random mutagenesis may be performed at the target
codon or region, and the expressed HDAC variants can be screened for the
optimal combination of desired activity. Techniques for making substitution
mutations at predetermined sites in DNA having a known sequence are well
known, for example, M13 primer mutagenesis and PCR mutagenesis.
Screening of the mutants is accomplished using assays of HDAC protein
activity; for example, for binding domain mutations, competitive binding
studies may be carried out.

Amino acid substitutions are typically of single residues; insertions
usually are on the order of from one to twenty amino acids, although
considerably larger insertions may be tolerated. Deletions range from about
one to about 20 residues, although in some cases, deletions may be much
larger.

Substitutions, deletions, insertions, or any combination thereof, may be
used to arrive at a final HDAC derivative. Generally, these changes affect
only a few amino acids to minimize the alteration of the molecule. However,
larger changes may be tolerated in certain circumstances. When small
alterations in the characteristics of the HDAC protein are desired or
warranted, substitutions are generally made in accordance with the following
table:

Original    Conservative        Original    Conservative
Residue     Substitution(s)     Residue     Substitution(s)

Ala         Ser                 Leu         Ile, Val
Arg         Lys                 Lys         Arg, Gln, Glu
Asn         Gln, His            Met         Leu, Ile
Asp         Glu                 Phe         Met, Leu, Tyr
Cys         Ser                 Ser         Thr
Gln         Asn                 Thr         Ser
Glu         Asp                 Trp         Tyr
Gly         Pro                 Tyr         Trp, Phe
His         Asn, Gln            Val         Ile, Leu
Ile         Leu, Val
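For illustration, the conservative substitution table can be expressed as a
lookup and used to label each difference between a reference and a variant
sequence. This is a minimal sketch: the names CONSERVATIVE, is_conservative
and classify_variant are hypothetical, the two four-residue sequences are
made up, and the table is read as directional (original residue to listed
replacement), exactly as printed.

```python
# Original residue -> conservative replacements, as listed in the table.
CONSERVATIVE = {
    "Ala": {"Ser"}, "Arg": {"Lys"}, "Asn": {"Gln", "His"}, "Asp": {"Glu"},
    "Cys": {"Ser"}, "Gln": {"Asn"}, "Glu": {"Asp"}, "Gly": {"Pro"},
    "His": {"Asn", "Gln"}, "Ile": {"Leu", "Val"}, "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"}, "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"}, "Ser": {"Thr"}, "Thr": {"Ser"},
    "Trp": {"Tyr"}, "Tyr": {"Trp", "Phe"}, "Val": {"Ile", "Leu"},
}


def is_conservative(original, replacement):
    """True if the table lists `replacement` for `original`."""
    return replacement in CONSERVATIVE.get(original, set())


def classify_variant(reference, variant):
    """Label every substituted position in two equal-length residue lists.

    Returns a list of (1-based position, original, replacement, label).
    """
    if len(reference) != len(variant):
        raise ValueError("only substitutional variants are handled here")
    changes = []
    for pos, (ref, var) in enumerate(zip(reference, variant), start=1):
        if ref != var:
            label = "conservative" if is_conservative(ref, var) else "non-conservative"
            changes.append((pos, ref, var, label))
    return changes


# Made-up sequences for demonstration only.
reference = ["Met", "Lys", "Leu", "Asp"]
variant = ["Met", "Arg", "Pro", "Asp"]
changes = classify_variant(reference, variant)
print(changes)
```

Note that Pro has no entry as an original residue in the table, so under this
reading any replacement of Pro is reported as non-conservative.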







Substantial changes in function or immunological identity are made by
selecting substitutions that are less conservative than those shown in the
above Table. For example, substitutions may be made which more
significantly affect the structure of the polypeptide backbone in the area of the
alteration, for example, the alpha-helical, or beta-sheet structure; the charge
or hydrophobicity of the molecule at the target site; or the bulk of the side
chain. The substitutions which generally are expected to produce the greatest
changes in the polypeptide's properties are those in which (a) a hydrophilic
residue, e.g., seryl or threonyl, is substituted for (or by) a hydrophobic residue,
e.g., leucyl, isoleucyl, phenylalanyl, valyl, or alanyl; (b) a cysteine or proline is
substituted for (or by) any other residue; (c) a residue having an
electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or
by) an electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue
having a bulky side chain, e.g., phenylalanine, is substituted for (or by) a
residue that does not have a side chain, e.g., glycine.

WO 02/102323    PCT/US02/19560

While HDAC variants will ordinarily exhibit the same qualitative
biological activity or function, and elicit the same immune response, as the
naturally occurring analogue, variants may also be selected to modify the
characteristics of HDAC proteins as needed. Alternatively, the variant may be
designed such that the biological activity of the HDAC protein is altered, e.g.,
improved.

In another embodiment, the present invention
encompasses polynucleotides that encode the novel HDAC polypeptides
disclosed herein. Accordingly, any nucleic acid sequence that encodes the
amino acid sequence of an HDAC polypeptide of the invention can be used to
produce recombinant molecules that express that HDAC protein. In a
particular embodiment, the present invention encompasses the novel human
HDAC polynucleotides comprising the nucleic acid sequences of SEQ ID
NO:1, SEQ ID NO:12, SEQ ID NO:19, SEQ ID NO:88, SEQ ID NO:94, and
SEQ ID NO:96 as shown in FIG. 1, FIG. 5, FIG. 10, FIGS. 15A-15C, FIGS.
20A-20C, and FIGS. 21A-21B. More particularly, the present invention
embraces cloned full-length open reading frame human BMY_HDAL1,
BMY_HDAL2 and BMY_HDAL3 deposited at the American Type Culture
Collection (ATCC), 10801 University Boulevard, Manassas, VA 20110-2209
on ____ under ATCC Accession No. ____ according to the terms of the
Budapest Treaty.

As will be appreciated by the skilled practitioner in the art, the
degeneracy of the genetic code results in the production of more than one
appropriate nucleotide sequence encoding the HDAC polypeptides of the
present invention. Some of the sequences bear minimal homology to the
nucleotide sequences of any known and naturally occurring gene.
Accordingly, the present invention contemplates each and every possible
variation of nucleotide sequence that could be made by selecting
combinations based on possible codon choices. These combinations are
made in accordance with the standard triplet genetic code as applied to the
nucleotide sequence of a naturally occurring HDAC protein, and all such
variations are to be considered as being embraced herein.
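To make the scale of this degeneracy concrete, the following sketch builds
the standard triplet genetic code, shows two different nucleotide sequences
that encode the same peptide (a silent change), and counts how many
nucleotide sequences can encode a given peptide. The function names and the
three-residue peptide are illustrative assumptions and are not taken from the
sequence listing.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Amino acid -> list of synonymous codons.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def count_encodings(peptide):
    """Number of distinct nucleotide sequences encoding the peptide."""
    return prod(len(SYNONYMS[aa]) for aa in peptide)


# Two different DNA sequences, one encoded peptide (Met-Leu-Lys).
print(translate("ATGCTGAAA"), translate("ATGTTAAAG"))
print(count_encodings("MLK"))  # 1 codon for M x 6 for L x 2 for K
```

The count grows multiplicatively with length, which is why the set of
sequences encoding a full-length polypeptide is far too large to list
individually.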

Although nucleotide sequences which encode the HDAC polypeptides
and variants thereof are preferably capable of hybridizing to the nucleotide
sequence of the naturally occurring HDAC polypeptides under appropriately
selected conditions of stringency, it may be advantageous to produce
nucleotide sequences encoding the HDAC polypeptides, or derivatives
thereof, which possess a substantially different codon usage. Codons may be
selected to increase the rate at which expression of the peptide/polypeptide
occurs in a particular prokaryotic or eukaryotic host in accordance with the
frequency with which particular codons are utilized by the host, for example, in
plant cells or yeast cells or amphibian cells. Other reasons for substantially
altering the nucleotide sequence encoding the HDAC polypeptides, and
derivatives, without altering the encoded amino acid sequences, include the
production of mRNA transcripts having more desirable properties, such as a
greater half-life, than transcripts produced from the naturally occurring
sequence.
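A minimal sketch of such codon replacement is given below, assuming a
caller-supplied table of host-preferred codons. The preference table
HOST_PREFERRED is hypothetical and does not represent measured codon usage
for any organism; the function name recode and the short input sequence are
likewise illustrative. The check inside the loop ensures that the encoded
amino acid sequence is left unchanged.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def recode(dna, preferred):
    """Swap each codon for the host-preferred synonym, keeping the protein.

    `preferred` maps a one-letter amino acid to the codon to use for it;
    amino acids without an entry keep their original codon.
    """
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        choice = preferred.get(amino_acid, codon)
        if CODON_TABLE[choice] != amino_acid:
            raise ValueError(f"{choice} does not encode {amino_acid}")
        out.append(choice)
    return "".join(out)


# Hypothetical preferences for an unnamed host; NOT measured usage data.
HOST_PREFERRED = {"L": "CTG", "K": "AAG", "R": "CGT"}

native = "ATGTTAAAAAGA"  # encodes Met-Leu-Lys-Arg
optimized = recode(native, HOST_PREFERRED)
print(optimized)
```

In practice the preference table would be derived from codon-usage
frequencies of the chosen host; only the recoding step is shown here.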

The present invention also encompasses production of DNA
sequences, or portions thereof, which encode the HDAC polypeptides, and
derivatives of these polypeptides, entirely by synthetic chemistry. After
production, the synthetic sequence may be inserted into any of the many
available expression vectors and cell systems using reagents that are well
known and practiced by those in the art. Moreover, synthetic chemistry may
be used to introduce mutations into a sequence encoding an HDAC
polypeptide, or any fragment thereof.

Also encompassed by the present invention are polynucleotide
sequences that are capable of hybridizing to the HDAC nucleotide sequences
presented herein, such as those shown in SEQ ID NO:1, SEQ ID NO:12, SEQ
ID NO:19, SEQ ID NO:88, SEQ ID NO:94, and SEQ ID NO:96, or sequences
complementary thereto, under various conditions of stringency. Hybridization
conditions are typically based on the melting temperature (Tm) of the nucleic
acid binding complex or probe (See, G.M. Wahl and S.L. Berger, 1987,
Methods Enzymol., 152:399-407 and A.R. Kimmel, 1987, Methods
Enzymol., 152:507-511), and may be used at a defined stringency. For
example, included in the present invention are sequences capable of
hybridizing under moderately stringent conditions to the HDAC nucleic acid
sequences of SEQ ID NO:1, SEQ ID NO:12, SEQ ID NO:19, SEQ ID
NO:88, SEQ ID NO:94, or SEQ ID NO:96, and other sequences which are
degenerate to those which encode the HDAC polypeptides (e.g., as a
nonlimiting example: prewashing solution of 2 X SSC, 0.5% SDS, 1.0 mM
EDTA, pH 8.0, and hybridization conditions of 50°C, 5 X SSC, overnight).
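As a rough guide to how Tm may be estimated when selecting such conditions,
the sketch below implements two widely quoted empirical approximations: the
Wallace rule for short oligonucleotides and a salt- and G+C-dependent formula
for longer probes. Neither formula is stated in this specification; both are
assumptions brought in for illustration, published variants differ in their
constants, and the sodium ion concentration is left as an input rather than
derived from the SSC strength. The probe sequences are made up.

```python
import math


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence (0.0 to 1.0)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def tm_wallace(oligo):
    """Wallace rule for short oligonucleotides, in degrees C.

    Tm = 2 * (number of A + T) + 4 * (number of G + C)
    """
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def tm_long_probe(seq, sodium_molar):
    """Approximate Tm of a long DNA duplex, in degrees C.

    Tm = 81.5 + 16.6 * log10([Na+]) + 0.41 * (%G+C) - 600 / N
    where N is the duplex length in nucleotides.
    """
    n = len(seq)
    return (81.5 + 16.6 * math.log10(sodium_molar)
            + 0.41 * (100.0 * gc_fraction(seq)) - 600.0 / n)


short_tm = tm_wallace("ATGCATGCATGCATGC")   # made-up 16-mer: 8 A/T, 8 G/C
probe = "AT" * 25 + "GC" * 25               # made-up 100-nt probe, 50% G+C
long_tm = tm_long_probe(probe, 1.0)         # assumed 1.0 M sodium ion
print(short_tm, round(long_tm, 1))
```

Hybridization and wash temperatures are then typically chosen some number of
degrees below the estimated Tm, with a smaller offset corresponding to higher
stringency.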

In another embodiment of the present invention, polynucleotide
sequences or fragments (peptides) thereof which encode the HDAC
polypeptide may be used in recombinant DNA molecules to direct the
expression of the HDAC polypeptide products, or fragments or functional
equivalents thereof, in appropriate host cells. Because of the inherent
degeneracy of the genetic code, other DNA sequences, which encode
substantially the same or a functionally equivalent amino acid sequence,
may be produced, and these sequences may be used to express recombinant
HDAC polypeptides.

As will be appreciated by those having skill in the art, it may be
advantageous to produce HDAC polypeptide-encoding nucleotide sequences
possessing non-naturally occurring codons. For example, codons preferred
by a particular prokaryotic or eukaryotic host can be selected to increase the
rate of protein expression or to produce a recombinant RNA transcript having
desirable properties, such as a half-life which is longer than that of a transcript
generated from the naturally occurring sequence.

The nucleotide sequences of the present invention can be engineered
using methods generally known in the art in order to alter HDAC
polypeptide-encoding sequences for a variety of reasons, including, but not
limited to, alterations which modify the cloning, processing, and/or expression
of the gene products. DNA shuffling by random fragmentation and PCR reassembly
of gene fragments and synthetic oligonucleotides may be used to engineer
the nucleotide sequences. For example, site-directed mutagenesis may be
used to insert new restriction sites, alter glycosylation patterns, change codon
preference, produce splice variants, or introduce mutations, and the like.

In another embodiment of the present invention, natural, modified, or
recombinant nucleic acid sequences, or a fragment thereof, encoding the
HDAC polypeptides may be ligated to a heterologous sequence to encode a
fusion protein. For example, for screening peptide libraries for inhibitors or
modulators of HDAC activity or binding, it may be useful to encode a chimeric
HDAC protein or peptide that can be recognized by a commercially available
antibody. A fusion protein may also be engineered to contain a cleavage site
located between an HDAC protein-encoding sequence and the heterologous
protein sequence, so that the HDAC protein may be cleaved and purified
away from the heterologous moiety.
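The arrangement of heterologous partner, cleavage site, and target can be
sketched at the amino acid level as follows. The recognition motifs used
(Ile-Glu/Asp-Gly-Arg for factor Xa, and Leu-Val-Pro-Arg followed by Gly-Ser
for thrombin, each cut after the Arg) are commonly cited consensus sites and
are assumptions not drawn from this specification; the TAG and TARGET strings
are made-up placeholders, not real tag or HDAC sequences.

```python
import re

# Commonly cited recognition motifs; cleavage is taken to occur after the Arg.
SITES = {
    "factor_xa": re.compile(r"I[ED]GR"),
    "thrombin": re.compile(r"LVPR(?=GS)"),
}


def cleave(protein, protease):
    """Split a one-letter protein string after every recognition site."""
    fragments, start = [], 0
    for match in SITES[protease].finditer(protein):
        fragments.append(protein[start:match.end()])
        start = match.end()
    fragments.append(protein[start:])
    return [f for f in fragments if f]  # drop an empty trailing fragment


TAG = "MTAGTAG"      # hypothetical stand-in for the heterologous partner
TARGET = "AKLVDEQW"  # hypothetical stand-in for the HDAC portion

fusion = TAG + "IEGR" + TARGET  # partner, factor Xa site, then target
fragments = cleave(fusion, "factor_xa")
print(fragments)
```

Placing the site immediately upstream of the target, as here, means that
cleavage after the Arg releases the target without residues from the site
attached to its N-terminus.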

In another embodiment, ligand-binding assays are useful to identify
inhibitor or antagonist compounds that interfere with the function of the HDAC
protein, or activator compounds that stimulate the function of the
HDAC protein. Preferred are inhibitor or antagonist compounds. Such
assays are useful even if the function of a protein is not known. These assays
are designed to detect binding of test compounds (i.e., test agents) to
particular target molecules, e.g., proteins or peptides. The detection may
involve direct measurement of binding. Alternatively, indirect indications of
binding may involve stabilization of protein structure, or disruption or
enhancement of a biological function. Non-limiting examples of useful
ligand-binding assays are detailed below.

One useful method for the detection and isolation of binding proteins is
the Biomolecular Interaction Assay (BIAcore) system developed by
Pharmacia Biosensor and described in the manufacturer's protocol (LKB
Pharmacia, Sweden). The BIAcore system uses an affinity purified anti-GST
antibody to immobilize GST-fusion proteins onto a sensor chip. The sensor
utilizes surface plasmon resonance, which is an optical phenomenon that
detects changes in refractive indices. Accordingly, a protein of interest, e.g.,
an HDAC polypeptide, or fragment thereof, of the present invention, is coated
onto a chip and test compounds (i.e., test agents) are passed over the chip.
Binding is detected by a change in the refractive index (surface plasmon
resonance).

A different type of ligand-binding assay involves scintillation proximity assays
(SPA), as described in U.S. Patent No. 4,568,649. In a modification of this
assay currently undergoing development, chaperonins are used to distinguish
folded and unfolded proteins. A tagged protein is attached to SPA beads, and
test compounds are added. The bead is then subjected to mild denaturing
conditions, such as, for example, heat, exposure to SDS, and the like, and a
purified labeled chaperonin is added. If a test compound (i.e., test agent) has
bound to a target protein, the labeled chaperonin will not bind; conversely, if
no test compound has bound, the protein will undergo some degree of
denaturation and the chaperonin will bind.

In another type of ligand-binding assay, proteins containing mitochondrial
targeting signals are imported into isolated mitochondria in vitro (Hurt et al.,
1985, EMBO J., 4:2061-2068; Eilers and Schatz, 1986, Nature, 322:228-231).
In a mitochondrial import assay, expression vectors are constructed in which
nucleic acids encoding particular target proteins are inserted downstream of
sequences encoding mitochondrial import signals. The chimeric proteins are
synthesized and tested for their ability to be imported into isolated
mitochondria in the absence and presence of test compounds. A test
compound that binds to the target protein should inhibit its uptake into isolated
mitochondria in vitro.

Another type of ligand-binding assay suitable for use according to the
present invention is the yeast two-hybrid system (Fields and Song, 1989,
Nature, 340:245-246). The yeast two-hybrid system takes advantage of the
properties of the GAL4 protein of the yeast S. cerevisiae. The GAL4 protein is
a transcriptional activator required for the expression of genes encoding
enzymes involved in the utilization of galactose. GAL4 protein consists of two
separable and functionally essential domains: an N-terminal domain, which
binds to specific DNA sequences (UASG); and a C-terminal domain
containing acidic regions, which is necessary to activate transcription. The
native GAL4 protein, containing both domains, is a potent activator of
transcription when yeast cells are grown on galactose medium. The
N-terminal domain binds to DNA in a sequence-specific manner but is unable to
activate transcription. The C-terminal domain contains the activating regions
but cannot activate transcription because it fails to be localized to UASG. The
two-hybrid system uses two hybrid proteins containing parts of
GAL4: (1) a GAL4 DNA-binding domain fused to a protein X, and (2) a GAL4
activation region fused to a protein Y. If X and Y can form a protein-protein
complex and reconstitute proximity of the GAL4 domains, transcription of a
gene regulated by UASG occurs. Creation of two hybrid proteins, each
containing one of the interacting proteins X and Y, allows the activation region
of GAL4 to be brought to its normal site of action.

The binding assay described in Fodor et al., 1991, Science, 251:767-773,
which involves testing the binding affinity of test compounds for a
plurality of defined polymers synthesized on a solid substrate, may also be
useful. Compounds that bind to an HDAC polypeptide, or portions thereof,
according to this invention are potentially useful as agents for use in
therapeutic compositions.

In another embodiment, sequences encoding an HDAC polypeptide
may be synthesized in whole, or in part, using chemical methods well known
in the art (See, for example, M.H. Caruthers et al., 1980, Nucl. Acids Res.
Symp. Ser., 215-223 and T. Horn et al., 1980, Nucl. Acids Res. Symp. Ser.,
225-232). Alternatively, an HDAC protein or peptide itself may be produced
using chemical methods to synthesize the amino acid sequence of the HDAC
polypeptide or peptide, or a fragment or portion thereof. For example, peptide
synthesis can be performed using various solid-phase techniques (J.Y.
Roberge et al., 1995, Science, 269:202-204) and automated synthesis may
be achieved, for example, using the ABI 431A Peptide Synthesizer (PE
Biosystems).

The newly synthesized peptide can be substantially purified by
preparative high performance liquid chromatography (e.g., T. Creighton, 1983,
Proteins, Structures and Molecular Principles, WH Freeman and Co., New
York, N.Y.), by reversed-phase high performance liquid chromatography, or
other purification methods as are known in the art. The composition of the
synthetic peptides may be confirmed by amino acid analysis or sequencing
(e.g., the Edman degradation procedure; Creighton, supra). In addition, the
amino acid sequence of an HDAC polypeptide, peptide, or any portion
thereof, may be altered during direct synthesis and/or combined using
chemical methods with sequences from other proteins, or any part thereof, to
produce a variant polypeptide.

Expression of Human HDAC Proteins

To express a biologically active/functional HDAC polypeptide or
peptide, the nucleotide sequences encoding the HDAC polypeptides, or
functional equivalents, may be inserted into an appropriate expression vector,
i.e., a vector which contains the necessary elements for the transcription and
translation of the inserted coding sequence. Methods that are well known to
and practiced by those skilled in the art may be used to construct expression
vectors containing sequences encoding an HDAC polypeptide or peptide and
appropriate transcriptional and translational control elements. These methods
include in vitro recombinant DNA techniques, synthetic techniques, and in
vivo genetic recombination. Such techniques are described in J. Sambrook et
al., 1989, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press,
Plainview, N.Y. and in F.M. Ausubel et al., 1989, Current Protocols in
Molecular Biology, John Wiley & Sons, New York, N.Y.

A variety of expression vector/host systems may be utilized to contain
and express sequences encoding an HDAC polypeptide or peptide. Such
expression vector/host systems include, but are not limited to,
microorganisms such as bacteria transformed with recombinant
bacteriophage, plasmid, or cosmid DNA expression vectors; yeast or fungi
transformed with yeast or fungal expression vectors; insect cell systems
infected with virus expression vectors (e.g., baculovirus); plant cell systems
transformed with virus expression vectors (e.g., cauliflower mosaic virus
(CaMV) and tobacco mosaic virus (TMV)), or with bacterial expression vectors
(e.g., Ti or pBR322 plasmids); or animal cell systems. The host cell employed
is not limiting to the present invention.

"Control elements" or "regulatory sequences" are those non-translated
regions of the vector, e.g., enhancers, promoters, 5' and 3' untranslated
regions, which interact with host cellular proteins to carry out transcription and
translation. Such elements may vary in their strength and specificity.
Depending on the vector system and host utilized, any number of suitable
transcription and translation elements, including constitutive and inducible
promoters, may be used. For example, when cloning in bacterial systems,
inducible promoters such as the hybrid lacZ promoter of the BLUESCRIPT
phagemid (Stratagene, La Jolla, CA) or PSPORT1 plasmid (Life
Technologies), and the like, may be used. The baculovirus polyhedrin
promoter may be used in insect cells. Promoters or enhancers derived from
the genomes of plant cells (e.g., heat shock, RUBISCO, and storage protein
genes), or from plant viruses (e.g., viral promoters or leader sequences), may
be cloned into the vector. In mammalian cell systems, promoters from
mammalian genes or from mammalian viruses are preferred. If it is necessary
to generate a cell line that contains multiple copies of the sequence encoding
an HDAC polypeptide or peptide, vectors based on SV40 or EBV may be
used with an appropriate selectable marker.

- In bacterial systems, a number of expression vectors may be selected, 
depending upon the use intended for the expressed HDAC product. For 
example, when large quantities of expressed protein are needed for the 
induction of antibodies, vectors that direct high level expression of fusion 
proteins that are readily purified may be used. Such vectors include, but are 
not limited to, the multifunctional E. coli cloning and expression vectors such 
as BLUESCRIPT (Stratagene), in which the sequence encoding an HDAC 
polypeptide, or peptide, may be ligated into the vector in-frame with 
sequences for the amino-terminal Met and the subsequent 7 residues of 3- 
galactosidase, so that a hybrid protein is produced; pIN vectors (See, G. Van 
Heeke and S.M. Schuster, 1989, J. Biol. Chem., 264:5503-5509); and the like. 
pGEX vectors (Promega, Madison, Wl) may also be used to express foreign 
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polypeptides, as fusion proteins with glutathione S-transferase (GST). In 
general, such fusion proteins are soluble and can be easily purified from lysed 
cells by adsorption to glutathione-agarose beads followed by elution in the 
presence of free glutathione. Proteins made in such systems may be 
designed to include heparin, thrombin, or factor Xa protease cleavage sites
so that the cloned polypeptide of interest can be released from the GST 
moiety at will. 

In the yeast, Saccharomyces cerevisiae, a number of vectors 
containing constitutive or inducible promoters such as alpha factor, alcohol
oxidase, and PGH may be used. (For reviews, see F.M. Ausubel et al., supra,
and Grant et al., 1987, Methods Enzymol., 153:516-544). 

Should plant expression vectors be desired and used, the expression 
of sequences encoding an HDAC polypeptide or peptide may be driven by 
any of a number of promoters. For example, viral promoters such as the 35S
and 19S promoters of CaMV may be used alone or in combination with the
omega leader sequence from TMV (N. Takamatsu, 1987, EMBO J., 6:307- 
311). Alternatively, plant promoters such as the small subunit of RUBISCO, 
or heat shock promoters, may be used (G. Coruzzi et al., 1984, EMBO J., 
3:1671-1680; R. Broglie et al., 1984, Science, 224:838-843; and J. Winter et
al., 1991, Results Probl. Cell Differ., 17:85-105). These constructs can be
introduced into plant cells by direct DNA transformation or pathogen-mediated 
transfection. Such techniques are described in a number of generally 
available reviews (See, for example, S. Hobbs or L.E. Murry, In: McGraw Hill 
Yearbook of Science and Technology (1992) McGraw Hill, New York, N.Y.,
pp. 191-196).

An insect system may also be used to express an HDAC polypeptide 
or peptide. For example, in one such system, Autographa californica nuclear
polyhedrosis virus (AcNPV) is used as a vector to express foreign genes in 
Spodoptera frugiperda cells or in Trichoplusia larvae. The sequences
encoding an HDAC polypeptide or peptide may be cloned into a non-essential
region of the virus such as the polyhedrin gene and placed under control of 
the polyhedrin promoter. Successful insertion of the sequence encoding the
HDAC polypeptide or peptide will render the polyhedrin gene inactive and
produce recombinant virus lacking coat protein. The recombinant viruses may
then be used to infect, for example, S. frugiperda cells or Trichoplusia larvae
in which the HDAC polypeptide or peptide product may be expressed (E.K.
Engelhard et al., 1994, Proc. Natl. Acad. Sci., 91:3224-3227).

In mammalian host cells, a number of viral-based expression systems 
may be utilized. In cases where an adenovirus is used as an expression 
vector, sequences encoding an HDAC polypeptide or peptide may be ligated 
into an adenovirus transcription/translation complex containing the late
promoter and tripartite leader sequence. Insertion in a non-essential E1 or E3
region of the viral genome may be used to obtain a viable virus which is 
capable of expressing the HDAC polypeptide or peptide in infected host cells 
(J. Logan and T. Shenk, 1984, Proc. Natl. Acad. Sci., 81:3655-3659). In
addition, transcription enhancers, such as the Rous sarcoma virus (RSV)
enhancer, may be used to increase expression in mammalian host cells.

Specific initiation signals may also be used to achieve more efficient 
translation of sequences encoding an HDAC polypeptide or peptide. Such
signals include the ATG initiation codon and adjacent sequences. In cases 
where sequences encoding an HDAC polypeptide or peptide, its initiation
codon, and upstream sequences are inserted into the appropriate expression
vector, no additional transcriptional or translational control signals may be
needed. However, in cases where only coding sequence, or a fragment 
thereof, is inserted, exogenous translational control signals, including the ATG 
initiation codon, should be provided. Furthermore, the initiation codon should
be in the correct reading frame to ensure translation of the entire insert.
Exogenous translational elements and initiation codons may be of various 
origins, both natural and synthetic. The efficiency of expression may be 
enhanced by the inclusion of enhancers which are appropriate for the 
particular cell system that is used, such as those described in the literature (D.
Scharf et al., 1994, Results Probl. Cell Differ., 20:125-162).
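
By way of illustration only, the reading-frame requirement described above can be checked computationally before a construct is made. The following is a minimal sketch, not a method disclosed herein: the function name, its arguments, and the simplification that translation starts at the first ATG located at or upstream of the insert are all assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_insert_frame(construct: str, insert_start: int, insert_len: int) -> dict:
    """Check whether an insert is read in frame from an upstream ATG.

    construct    -- coding-strand DNA (5'->3') of vector plus insert
    insert_start -- 0-based index of the first insert nucleotide
    insert_len   -- insert length in nucleotides

    Assumption for this sketch: translation initiates at the first ATG that
    begins at or before the first insert nucleotide.
    """
    seq = construct.upper()
    insert_end = insert_start + insert_len

    # Search only up to (and including) an ATG that starts at insert_start.
    atg = seq.find("ATG", 0, insert_start + 3)
    if atg == -1:
        return {"atg": None, "in_frame": False, "first_stop": None,
                "translates_whole_insert": False}

    # The insert is in frame when it begins a whole number of codons after ATG.
    in_frame = (insert_start - atg) % 3 == 0

    # Walk codon by codon from the ATG and record the first stop codon.
    first_stop = None
    for i in range(atg, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            first_stop = i
            break

    # The whole insert is translated only if it is in frame and no stop codon
    # ends before the last insert nucleotide.
    whole = in_frame and (first_stop is None or first_stop + 3 >= insert_end)
    return {"atg": atg, "in_frame": in_frame, "first_stop": first_stop,
            "translates_whole_insert": whole}
```

A construct in which a single extra nucleotide separates the leader from the insert would be reported as out of frame, which is the situation the passage above warns against.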

Moreover, a host cell strain may be chosen for its ability to modulate 
the expression of the inserted sequences or to process the expressed protein 
in the desired fashion. Such modifications of the polypeptide include, but are
not limited to, acetylation, carboxylation, glycosylation, phosphorylation, 
lipidation, and acylation. Post-translational processing which cleaves a
"prepro" form of the protein may also be used to facilitate correct insertion,
folding and/or function. Different host cells having specific cellular machinery
and characteristic mechanisms for such post-translational activities (e.g.,
COS, CHO, HeLa, MDCK, HEK293, and WI38) are available from the
American Type Culture Collection (ATCC), 10801 University Boulevard,
Manassas, VA 20110-2209, and may
be chosen to ensure the correct modification and processing of the foreign
protein. 

For long-term, high-yield production of recombinant proteins, stable 
expression is preferred. For example, cell lines which stably express an 
HDAC protein may be transformed using expression vectors which may
contain viral origins of replication and/or endogenous expression elements
and a selectable marker gene on the same, or on a separate, vector. 
Following the introduction of the vector, cells may be allowed to grow for 1-2 
days in an enriched cell culture medium before they are switched to selective 
medium. The purpose of the selectable marker is to confer resistance to
selection, and its presence allows the growth and recovery of cells that
successfully express the introduced sequences. Resistant clones of stably 
transformed cells may be proliferated using tissue culture techniques 
appropriate to the cell type. 

Any number of selection systems may be used to recover transformed
cell lines. These include, but are not limited to, the Herpes Simplex Virus
thymidine kinase (HSV TK) (M. Wigler et al., 1977, Cell, 11:223-32) and
adenine phosphoribosyltransferase (I. Lowy et al., 1980, Cell, 22:817-23)
genes, which can be employed in tk⁻ or aprt⁻ cells, respectively. Also,
anti-metabolite, antibiotic or herbicide resistance can be used as the basis for
selection; for example, dhfr, which confers resistance to methotrexate (M.
Wigler et al., 1980, Proc. Natl. Acad. Sci., 77:3567-70); npt, which confers
resistance to the aminoglycosides neomycin and G-418 (F. Colbere-Garapin
et al., 1981, J. Mol. Biol., 150:1-14); and als or pat, which confer resistance to
chlorsulfuron and phosphinothricin, respectively (Murry,
supra). Additional selectable genes have been described, for example, trpB, 
which allows cells to utilize indole in place of tryptophan, or hisD, which allows 
cells to utilize histidinol in place of histidine (S.C. Hartman and R.C. Mulligan,
1988, Proc. Natl. Acad. Sci., 85:8047-51). Recently, the use of visible
markers has gained popularity with such markers as the anthocyanins,
β-glucuronidase and its substrate GUS, and luciferase and its substrate
luciferin, which are widely used not only to identify transformants, but also to
quantify the amount of transient or stable protein expression that is
attributable to a specific vector system (C.A. Rhodes et al., 1995, Methods 
Mol. Biol., 55:121-131). 

Although the presence/absence of marker gene expression suggests 
that the gene of interest is also present, the presence and expression of the
desired gene of interest may need to be confirmed. For example, if an HDAC
nucleic acid sequence is inserted within a marker gene sequence, 
recombinant cells containing sequences encoding the HDAC polypeptide or 
peptide can be identified by the absence of marker gene function. 
Alternatively, a marker gene can be placed in tandem with a sequence
encoding an HDAC polypeptide or peptide under the control of a single
promoter. Expression of the marker gene in response to induction or 
selection usually indicates co-expression of the tandem gene. 

Alternatively, host cells which contain the nucleic acid sequence 
encoding an HDAC polypeptide or peptide and which express the HDAC
product may be identified by a variety of procedures known to those having
skill in the art. These procedures include, but are not limited to, DNA-DNA or 
DNA-RNA hybridizations and protein bioassay or immunoassay techniques, 
including membrane, solution, or chip based technologies, for the detection 
and/or quantification of nucleic acid or protein. 

Preferably, the HDAC polypeptide or peptide of this invention is
substantially purified after expression. HDAC proteins and peptides can be
isolated or purified in a variety of ways known to and practiced by those
having skill in the art, depending on what other components may be present in
the sample. Standard purification methods include electrophoretic, molecular,
immunological and chromatographic techniques, including, but not limited to,
ion exchange, hydrophobic, affinity and reverse phase HPLC chromatography,
and chromatofocusing. For example, an HDAC protein or peptide can be
purified using a standard anti-HDAC antibody column. Ultrafiltration and 
diafiltration techniques, in conjunction with protein concentration, are also 
useful. For general guidance in suitable purification techniques, see R.
Scopes, 1982, Protein Purification, Springer-Verlag, NY. As will be
understood by the skilled practitioner, the degree of purification necessary will
vary depending on the intended use of the HDAC protein or peptide; in some 
instances, no purification will be necessary. 

In addition to recombinant production, fragments of an HDAC 
polypeptide or peptide may be produced by direct peptide synthesis using
solid-phase techniques (J. Merrifield, 1963, J. Am. Chem. Soc., 85:2149-2154).
Protein synthesis may be performed using manual techniques or by
automation. Automated synthesis may be achieved, for example, using an ABI
431A Peptide Synthesizer (PE Biosystems). If desired, various fragments of
an HDAC polypeptide can be chemically synthesized separately and then
combined using chemical methods to produce the full length molecule.

Detection of Human HDAC Polynucleotide

The presence of polynucleotide sequences encoding an HDAC 
polypeptide of this invention can be detected by DNA-DNA or DNA-RNA
hybridization, or by amplification using probes or portions or fragments of
polynucleotides encoding the HDAC polypeptide. Nucleic acid amplification
based assays involve the use of oligonucleotides or oligomers, based on the 
sequences encoding a particular HDAC polypeptide or peptide, to detect 
transformants containing DNA or RNA encoding an HDAC polypeptide or 
peptide. 

A wide variety of labels and conjugation techniques are known and
employed by those skilled in the art and may be used in various nucleic acid
and amino acid assays. Means for producing labeled hybridization or PCR
probes for detecting sequences related to polynucleotides encoding an HDAC
polypeptide or peptide include oligo-labeling, nick translation, end-labeling, or 
PCR amplification using a labeled nucleotide. Alternatively, the sequences 
encoding an HDAC polypeptide, or any portions or fragments thereof, may be 
cloned into a vector for the production of an mRNA probe. Such vectors are
known in the art, are commercially available, and may be used to synthesize 
RNA probes in vitro by addition of an appropriate RNA polymerase, such as 
T7, T3, or SP6, and labeled nucleotides. These procedures may be
conducted using a variety of commercially available kits (e.g., Amersham
Pharmacia Biotech, Promega and U.S. Biochemical Corp.).

Suitable reporter molecules or labels which may be used include 
radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic
agents, as well as substrates, cofactors, inhibitors, magnetic particles, and the
like. Non-limiting examples of labels include radioisotopes, such as 3H, 14C,
and 32P, and non-radioactive molecules, such as digoxigenin. In addition,
nucleic acid molecules may be modified using known techniques, for 
example, using RNA or DNA analogs, phosphorylation, dephosphorylation, 
methylation, or demethylation. 

Human HDAC Polypeptides - Production, Detection, Isolation

Host cells transformed with nucleotide sequences encoding an HDAC
protein or peptide, or fragments thereof, may be cultured under conditions
suitable for the expression and recovery of the protein from cell culture. The 
protein produced by a recombinant cell may be secreted or contained 
intracellularly depending on the sequence and/or the vector used. As will be
understood by those having skill in the art, expression vectors containing
polynucleotides which encode an HDAC protein or peptide may be designed 
to contain signal sequences that direct secretion of the HDAC protein or 
peptide through a prokaryotic or eukaryotic cell membrane. 

Other constructions may be used to join nucleic acid sequences 
encoding an HDAC protein or peptide to a nucleotide sequence encoding a
polypeptide domain that will facilitate purification of soluble proteins. Such 
purification facilitating domains include, but are not limited to, metal chelating 
peptides such as histidine-tryptophan modules that allow purification on
immobilized metals; protein A domains that allow purification on immobilized 
immunoglobulin; and the domain utilized in the FLAGS extension/affinity 
purification system (Immunex Corp., Seattle, WA). The inclusion of cleavable 
linker sequences such as those specific for Factor Xa or enterokinase
(Invitrogen, San Diego, CA) between the purification domain and the HDAC 
protein or peptide may be used to facilitate purification. One such expression 
vector provides for expression of a fusion protein containing HDAC-encoding 
sequence and a nucleic acid encoding 6 histidine residues preceding a
thioredoxin or an enterokinase cleavage site. The histidine residues facilitate
purification on IMAC (immobilized metal ion affinity chromatography) as
described by J. Porath et al., 1992, Prot. Exp. Purif., 3:263-281, while the
enterokinase cleavage site provides a means for purifying the HDAC
polypeptide from the fusion protein. For a discussion of suitable vectors for
fusion protein production, see D.J. Kroll et al., 1993, DNA Cell Biol., 12:441-453.
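
As a non-limiting illustration of the fusion arrangement just described, the sketch below assembles, at the amino acid level, a hexahistidine tag followed by an enterokinase recognition site and a payload peptide, and then models release of the payload by cleavage. The enterokinase recognition sequence Asp-Asp-Asp-Asp-Lys, with cleavage after the lysine, is well established; the function names and the payload peptide are hypothetical and are not taken from any sequence of this disclosure.

```python
HIS_TAG = "H" * 6      # six histidine residues for IMAC capture
EK_SITE = "DDDDK"      # enterokinase recognition sequence; cleavage follows the Lys


def build_fusion(payload: str) -> str:
    """Return Met + His6 + enterokinase site + payload as a one-letter sequence."""
    return "M" + HIS_TAG + EK_SITE + payload


def enterokinase_cleave(fusion: str) -> tuple:
    """Split a fusion protein after the first enterokinase site.

    Returns (tag_fragment, released_payload). Raises ValueError if the fusion
    contains no enterokinase site.
    """
    idx = fusion.find(EK_SITE)
    if idx == -1:
        raise ValueError("no enterokinase site found")
    cut = idx + len(EK_SITE)
    return fusion[:cut], fusion[cut:]
```

The tag fragment retains the histidine residues, so after cleavage it would be expected to remain bound to the metal resin while the payload is recovered in the flow-through.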

Human artificial chromosomes (HACs) may be used to deliver larger 
fragments of DNA than can be contained and expressed in a plasmid vector. 
HACs are linear microchromosomes which may contain DNA sequences of 
10K to 10M in size, and contain all of the elements that are required for stable 
mitotic chromosome segregation and maintenance (See, J.J. Harrington et al., 
1997, Nature Genet., 15:345-355). HACs of 6 to 10M are constructed and
delivered via conventional delivery methods (e.g., liposomes, polycationic 
amino polymers, or vesicles) for therapeutic purposes. 

A variety of protocols for detecting and measuring the expression of an 
HDAC polypeptide using either polyclonal or monoclonal antibodies specific 
for the protein are known and practiced in the art. Examples include enzyme- 
linked immunosorbent assay (ELISA), radioimmunoassay (RIA), and 
fluorescence activated cell sorting (FACS). A two-site, monoclonal-based 
immunoassay utilizing monoclonal antibodies reactive with two non-interfering 
epitopes on the HDAC polypeptide is preferred, but a competitive binding 
assay may also be employed. These and other assays are described in the 
art (see, e.g., R. Hampton et al., 1990, Serological
Methods, a Laboratory Manual, APS Press, St Paul, MN, and D.E. Maddox et
al., 1983, J. Exp. Med., 158:1211-1216).

For use with these assays, amino acid sequences (e.g., polypeptides, 
peptides, antibodies, or antibody fragments) may be attached to a label 
capable of providing a detectable signal, either directly or indirectly, including,
but not limited to, radioisotope, fluorescent, and enzyme labels. Fluorescent 
labels include, for example, Cy3, Cy5, Alexa, BODIPY, fluorescein (e.g., 
FluorX, DTAF, and FITC), rhodamine (e.g., TRITC), auramine, Texas Red,
AMCA blue, and Lucifer Yellow. Preferred isotope labels include 3H, 14C, 32P,
35S, 36Cl, 51Cr, 57Co, 58Co, 59Fe, 90Y, 125I, 131I, and 186Re. Preferred enzyme
labels include peroxidase, β-glucuronidase, β-D-glucosidase,
β-D-galactosidase, urease, glucose oxidase plus peroxidase, and alkaline
phosphatase (see, e.g., U.S. Pat. Nos. 3,654,090; 3,850,752 and 4,016,043).
Enzymes can be conjugated by reaction with bridging molecules such as
carbodiimides, diisocyanates, glutaraldehyde, and the like. Enzyme labels
can be detected visually, or measured by colorimetric, spectrophotometric,
fluorospectrophotometric, amperometric, or gasometric techniques. Other
labeling systems, such as avidin/biotin and Tyramide Signal Amplification
(TSA™), are known in the art, and are commercially available (see, e.g., ABC
kit, Vector Laboratories, Inc., Burlingame, CA; NEN® Life Science Products,
Inc., Boston, MA). 

A compound that interacts with a histone deacetylase according to the 
present invention may be one that is a substrate for the enzyme, one that 
binds the enzyme at its active site, or one that otherwise acts to alter enzyme
activity by binding to an alternate site. A substrate may be acetylated
histones, or a labeled acetylated peptide fragment derived therefrom, such as
AcGly-Ala-Lys(ε-Ac)-Arg-His-Arg-Lys(ε-Ac)-ValNH2, or other
synthetic or naturally occurring substrates. Examples of compounds that bind
to histone deacetylase are known inhibitors such as n-butyrate, trichostatin,
trapoxin and SAHA (S. Swendeman et al., 1999, Cancer Res., 59(17):4392-4399).
The compound that interacts with a histone deacetylase is preferably
labeled to allow easy quantification of the level of interaction between the
compound and the enzyme. A preferred radiolabel is tritium. 

The test compound (i.e., test agent) may be a synthetic compound, a 
purified preparation, crude preparation, or an initial extract of a natural product 
obtained from plant, microorganism or animal sources.

One aspect of the present method is based on test compound-induced
inhibition of histone deacetylase activity. The enzyme inhibition assay 
involves adding histone deacetylase or an extract containing histone 
deacetylase to mixtures of an enzyme substrate and the test compound, both
of which are present in known concentrations. The amount of the enzyme is
chosen such that approximately 20% of the substrate is consumed during the 
assay. The assay is carried out with the test compound at a series of different 
dilution levels. After a period of incubation, the labeled portion of the 
substrate released by enzymatic action is separated and counted. The assay
is generally carried out in parallel with a negative control (i.e., no test
compound) and a positive control (i.e., containing a known enzyme inhibitor
instead of a test compound). The concentration of the test compound at
which 50% of the enzyme activity is inhibited (IC50) is determined using
art-recognized methods.
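
For illustration, one simple way of reducing counts from such a dilution series to an IC50 value is sketched below. The specification does not prescribe a particular calculation; the approach shown (percent inhibition relative to the negative control, followed by linear interpolation on log10 concentration between the two dilutions that bracket 50%) is an assumption made for the example, as are the function names. Nonlinear curve fitting is another commonly used alternative.

```python
import math


def percent_inhibition(counts: float, negative_control: float,
                       background: float = 0.0) -> float:
    """Percent inhibition relative to the no-compound (negative) control.

    counts           -- released label counted in the presence of test compound
    negative_control -- released label counted with no test compound
    background       -- optional counts from a no-enzyme blank
    """
    return 100.0 * (1.0 - (counts - background) / (negative_control - background))


def ic50_by_interpolation(concentrations, inhibitions):
    """Estimate IC50 by linear interpolation on log10(concentration).

    Returns None when the dilution series does not bracket 50% inhibition.
    """
    pairs = sorted(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(pairs, pairs[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None
```

The same interpolation can be applied to a competition binding series by substituting percent displacement of the labeled inhibitor for percent inhibition.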

Although enzyme inhibition is the most direct measure of the inhibitory
activity of the test compound, results obtained from a competitive binding
assay in which the test compound competes with a known inhibitor for binding
to the enzyme active site correlate well with the results obtained from the
enzyme inhibition assay described above. The binding assay represents a more
convenient way to assess enzyme inhibition, because it allows the use of a
crude extract containing histone deacetylase rather than partially purified 
enzyme. The use of a crude extract may not always be suitable in the 
enzyme inhibition assay because other enzymes present in the extract may 
act on the histone deacetylase substrate. 

The competition binding assay is carried out by adding a histone
deacetylase, or an extract containing histone deacetylase activity, to a mixture
of the test compound and a labeled inhibitor, both of which are present in the
mixture in known concentrations. After incubation, the enzyme-inhibitor
complex is separated from the unbound labeled inhibitors and unlabeled test
compound, and counted. The concentration of the test compound required to
inhibit 50% of the binding of the labeled inhibitor to the histone deacetylase
(IC50) is calculated.

In one method suitable for this invention, the IC50 of test compounds
against host histone deacetylase is determined using either the enzyme
inhibition assay or the binding assay as described above, to identify those
compounds that have selectivity for a particular type of histone deacetylase
over that of a host.
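
The comparison described in the preceding paragraph can be expressed as a fold-selectivity ratio. The sketch below is illustrative only: the ratio of host IC50 to target IC50, the 10-fold cut-off, and the compound names are assumptions chosen for the example rather than values taken from this disclosure.

```python
def fold_selectivity(ic50_host: float, ic50_target: float) -> float:
    """Ratio of host IC50 to target IC50; larger values mean greater
    selectivity for the target histone deacetylase."""
    return ic50_host / ic50_target


def selective_compounds(ic50_table: dict, min_fold: float = 10.0) -> dict:
    """Return compounds whose fold selectivity meets a chosen cut-off.

    ic50_table maps a compound name to (ic50_target, ic50_host), both in the
    same concentration units. The default cut-off is hypothetical.
    """
    hits = {}
    for name, (ic50_target, ic50_host) in ic50_table.items():
        fold = fold_selectivity(ic50_host, ic50_target)
        if fold >= min_fold:
            hits[name] = fold
    return hits
```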

Anti-Human HDAC Antibodies and Uses Thereof 

Antagonists or inhibitors of the HDAC polypeptides of the present 
invention may be produced using methods that are generally known in the art. 
In particular, purified HDAC polypeptides or peptides, or fragments thereof,
can be used to produce antibodies, or to screen libraries of pharmaceutical
agents or other compounds, particularly, small molecules, to identify those 
which specifically bind to the novel HDACs of this invention. 

Antibodies specific for an HDAC polypeptide, or immunogenic peptide 
fragments thereof, can be generated using methods that have long been
known and conventionally practiced in the art. Such antibodies may include,
but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab 
fragments, and fragments produced by an Fab expression library. 
Neutralizing antibodies (i.e., those which inhibit dimer formation) are
especially preferred for therapeutic use. 

For the production of antibodies, various hosts including goats, rabbits,
sheep, rats, mice, humans, and others, can be immunized by injection with
HDAC polypeptide, or any peptide fragment or oligopeptide thereof, which has 
immunogenic properties. Depending on the host species, various adjuvants 
may be used to increase the immunological response. Nonlimiting examples
of suitable adjuvants include Freund's (incomplete), mineral gels such as
aluminum hydroxide or silica, and surface active substances such as
lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, KLH, and
dinitrophenol. Adjuvants typically used in humans include BCG (bacilli
Calmette Guerin) and Corynebacterium parvum.

Preferably, the peptides, fragments, or oligopeptides used to induce 
antibodies to HDAC polypeptides (i.e., immunogens) have an amino acid 
sequence having at least five amino acids, and more preferably, at least 7-10
amino acids. It is also preferable that the immunogens are identical to a 
portion of the amino acid sequence of the natural protein; they may also 
contain the entire amino acid sequence of a small, naturally occurring 
molecule. The peptides, fragments or oligopeptides may comprise a single
epitope or antigenic determinant or multiple epitopes. Short stretches of
HDAC amino acids may be fused with those of another protein, such as KLH, 
and antibodies are produced against the chimeric molecule. 
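
As an illustration of the length preference stated above, the following sketch enumerates every contiguous peptide of 7 to 10 residues from a protein sequence, each being identical to a portion of that sequence. The function name is hypothetical, and the sketch makes no attempt to predict which peptides are actually immunogenic.

```python
def candidate_immunogens(protein: str, min_len: int = 7, max_len: int = 10) -> list:
    """List contiguous peptides of min_len..max_len residues.

    Returns (start, peptide) tuples, where start is the 1-based position of
    the first residue within the protein sequence.
    """
    if min_len < 1 or max_len < min_len:
        raise ValueError("require 1 <= min_len <= max_len")
    peptides = []
    for length in range(min_len, max_len + 1):
        # Slide a window of the current length across the whole sequence.
        for start in range(len(protein) - length + 1):
            peptides.append((start + 1, protein[start:start + length]))
    return peptides
```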

Monoclonal antibodies to HDAC polypeptides, or immunogenic 
fragments thereof, may be prepared using any technique which provides for
the production of antibody molecules by continuous cell lines in culture.
These include, but are not limited to, the hybridoma technique, the human
B-cell hybridoma technique, and the EBV-hybridoma technique (G. Kohler et al.,
1975, Nature, 256:495-497; D. Kozbor et al., 1985, J. Immunol. Methods,
81:31-42; R.J. Cote et al., 1983, Proc. Natl. Acad. Sci. USA, 80:2026-2030;
and S.P. Cole et al., 1984, Mol. Cell Biol., 62:109-120). The production of
monoclonal antibodies is well known and routinely used in the art. 

In addition, techniques developed for the production of "chimeric 
antibodies," the splicing of mouse antibody genes to human antibody genes to 
obtain a molecule with appropriate antigen specificity and biological activity
can be used (S.L. Morrison et al., 1984, Proc. Natl. Acad. Sci. USA,
81:6851-6855; M.S. Neuberger et al., 1984, Nature, 312:604-608; and S. Takeda et al.,
1985, Nature, 314:452-454). Alternatively, techniques described for the 
production of single chain antibodies may be adapted, using methods known 
in the art, to produce HDAC polypeptide- or peptide-specific single chain
antibodies. Antibodies with related specificity, but of distinct idiotypic
composition, may be generated by chain shuffling from random combinatorial
immunoglobulin libraries (D.R. Burton, 1991, Proc. Natl. Acad. Sci. USA,
88:11120-3). Antibodies may also be produced by inducing in vivo production
in the lymphocyte population or by screening recombinant immunoglobulin
libraries or panels of highly specific binding reagents as disclosed in the
literature (R. Orlandi et al., 1989, Proc. Natl. Acad. Sci. USA, 86:3833-3837
and G. Winter et al., 1991, Nature, 349:293-299).

Antibody fragments that contain specific binding sites for an HDAC 
polypeptide or peptide may also be generated. For example, such fragments 
include, but are not limited to, F(ab')2 fragments which can be produced by
pepsin digestion of the antibody molecule and Fab fragments which can be
generated by reducing the disulfide bridges of the F(ab')2 fragments.
Alternatively, Fab expression libraries may be constructed to allow rapid and
easy identification of monoclonal Fab fragments with the desired specificity
(W.D. Huse et al., 1989, Science, 254:1275-1281).

Various immunoassays can be used for screening to identify antibodies
having the desired specificity. Numerous protocols for competitive binding or
immunoradiometric assays using either polyclonal or monoclonal antibodies 
with established specificities are well known in the art. Such immunoassays 
typically involve measuring the formation of complexes between an HDAC 
polypeptide and its specific antibody. A two-site, monoclonal-based
immunoassay utilizing monoclonal antibodies reactive with two non-interfering
HDAC epitopes is preferred, but a competitive binding assay may also be 
employed (Maddox, supra). 

Antibodies which specifically bind HDAC epitopes can also be used in 
immunohistochemical staining of tissue samples to evaluate the abundance
and pattern of expression of each of the provided HDAC polypeptides.
Anti-HDAC antibodies can be used diagnostically in immunoprecipitation and
immunoblotting techniques to detect and evaluate HDAC protein levels in 
tissue as part of a clinical testing procedure. For instance, such 
measurements can be useful in predictive evaluations of the onset or
progression of proliferative or differentiation disorders. Similarly, the ability to
monitor HDAC protein levels in an individual can allow the determination of 
the efficacy of a given treatment regimen for an individual afflicted with such a 
disorder. The level of HDAC polypeptide may be measured from cells in a
bodily fluid, such as in samples of cerebrospinal fluid or amniotic fluid, or can
be measured in tissue, such as produced by biopsy. Diagnostic assays using
anti-HDAC antibodies can include, for example, immunoassays designed to
aid in early diagnosis of a disorder, particularly ones that are manifest at birth.
Diagnostic assays using anti-HDAC polypeptide antibodies can also include 
immunoassays designed to aid in early diagnosis and phenotyping of 
neoplastic or hyperplastic disorders. 

Another application of anti-HDAC antibodies according to the present
invention is in the immunological screening of cDNA libraries constructed in
expression vectors such as λgt11, λgt18-23, λZAP, and λORF8. Messenger
libraries of this type, having coding sequences inserted in the correct reading
frame and orientation, can produce fusion proteins. For example, λgt11 will
produce fusion proteins whose amino termini contain β-galactosidase amino
acid sequences and whose carboxy termini contain a foreign polypeptide.
Antigenic epitopes of an HDAC protein, e.g. other orthologs of a particular 
HDAC protein or other paralogs from the same species, can then be detected 
with antibodies by, for example, reacting nitrocellulose filters lifted from 
infected plates with anti-HDAC antibodies. Positive phage detected by this
assay can then be isolated from the infected plate. Thus, the presence of
HDAC homologs can be detected and cloned from other animals, as can
alternative isoforms (including splice variants) from humans.

Therapeutics/Treatments/Methods of Use Involving HDACs

In an embodiment of the present invention, the polynucleotide
encoding an HDAC polypeptide or peptide, or any fragment or complement
thereof, may be used for therapeutic purposes. In one aspect, antisense to 
the polynucleotide encoding a novel HDAC polypeptide may be used in 
situations in which it would be desirable to block the transcription of HDAC 
mRNA. In particular, cells may be transformed or transfected with sequences
complementary to polynucleotides encoding an HDAC polypeptide. Thus,
complementary molecules may be used to modulate human HDAC
polynucleotide and polypeptide activity, or to achieve regulation of gene
function. Such technology is now well known in the art, and sense or
antisense oligomers or oligonucleotides, or larger fragments, can be designed 
from various locations along the coding or control regions of polynucleotide 
sequences encoding the HDAC polypeptides. For antisense therapeutics, the
oligonucleotides in accordance with this invention preferably comprise at least
3 to 50 nucleotides of a sequence complementary to SEQ ID NO:1, SEQ ID
NO:12, SEQ ID NO:19, SEQ ID NO:88, SEQ ID NO:94, or SEQ ID NO:96. It
is more preferred that such oligonucleotides and analogs comprise at least 8
to 25 nucleotides, and still more preferred to comprise at least 12 to 20
nucleotides of this sequence.
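
By way of example only, candidate antisense oligonucleotides of a chosen length can be enumerated by taking the reverse complement of successive windows along a target sequence. The sketch below assumes a DNA target written 5' to 3' and uses hypothetical function names; it does not rank candidates by accessibility, specificity, or any other criterion, and any target sequence supplied to it is the user's own input rather than a sequence asserted to be one of the SEQ ID NOs above.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_oligos(target: str, length: int = 20, step: int = 1):
    """Yield (start, oligo) pairs for each window of the target.

    start  -- 0-based position of the window on the target strand
    oligo  -- 5'->3' sequence complementary to target[start:start + length]
    length -- oligonucleotide length; the default of 20 lies within the
              12 to 20 nucleotide range described as still more preferred
    """
    if length < 1 or step < 1:
        raise ValueError("length and step must be positive")
    for start in range(0, len(target) - length + 1, step):
        yield start, reverse_complement(target[start:start + length])
```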

Expression vectors derived from retroviruses, adenovirus, herpes or 
vaccinia viruses, or from various bacterial plasmids may be used for delivery 
of nucleotide sequences to the targeted organ, tissue or cell population. 
Methods which are well known to those skilled in the art can be used to
construct recombinant vectors which will express nucleic acid sequences that
are complementary to the nucleic acid sequences encoding the novel HDAC
polypeptides and peptides of the present invention. These techniques are
described both in J. Sambrook et al., supra and in F.M. Ausubel et al., supra.

A preferred approach for in vivo introduction of nucleic acid into a cell is
by use of a viral vector containing nucleic acid, e.g., a cDNA encoding the
particular HDAC polypeptide desired. Infection of cells with a viral vector has 
the advantage that a large proportion of the targeted cells can receive the 
nucleic acid. In addition, molecules encoded within the viral vector, e.g., by a 
cDNA contained in the viral vector, are expressed efficiently in cells that have 

5 taken up viral vector nucleic acid. As mentioned, retrovirus vectors, 
adenovirus vectors and adeno-associated virus vectors are exemplary 
recombinant gene* delivery system for the transfer of exogenous genes in 
vivo, particularly into humans. These vectors provide efficient delivery of 
genes into cells, and the transferred nucleic acids are stably integrated into 

D the chromosomal DNA of the host. 

In addition to the above-illustrated viral transfer methods, non-viral 
methods can also be employed to yield expression of an HDAC polypeptide in
the cells and/or tissue of an animal. Most non-viral methods of gene transfer
rely on normal mechanisms used by mammalian cells for the uptake and
intracellular transport of macromolecules. In preferred embodiments, non-viral
gene delivery systems rely on endocytic pathways for the uptake of the
novel HDAC polypeptide-encoding gene by the targeted cell. Exemplary gene
delivery systems of this type include liposomal derived systems, poly-lysine 
conjugates, and artificial viral envelopes. 

In clinical settings, the gene delivery systems for a therapeutic HDAC 
gene can be introduced into a patient by any of a number of methods, each of
which is familiar in the art. For instance, a pharmaceutical preparation of the
gene delivery system can be introduced systemically, e.g., by intravenous
injection, and specific transduction of the protein in the target cells occurs
predominantly from the specificity of transfection provided by the gene
delivery vehicle, cell-type or tissue-type expression due to the transcriptional
regulatory sequences controlling expression of the receptor gene, or a
combination thereof.

In other aspects, the initial delivery of a recombinant HDAC gene is 
more limited, for example, with introduction into an animal being quite 
localized. For instance, the gene delivery vehicle can be introduced by
catheter (see, U.S. Patent No. 5,328,470) or by stereotactic injection (e.g.,
Chen et al., 1994, Proc. Natl. Acad. Sci. USA, 91:3054-3057). An HDAC
nucleic acid sequence (gene), e.g., sequences represented by SEQ ID NO:1,
SEQ ID NO:12, SEQ ID NO:19, SEQ ID NO:88, SEQ ID NO:94, and/or SEQ
ID NO:96, or a fragment thereof, can be delivered in a gene therapy construct
by electroporation using techniques described, for example, by Dev et al.
(1994, Cancer Treat. Rev., 20:105-115).

The gene encoding an HDAC polypeptide can be turned off by 
transforming a cell or tissue with an expression vector that expresses high 
levels of an HDAC polypeptide-encoding polynucleotide, or a fragment
thereof. Such constructs may be used to introduce untranslatable sense or
antisense sequences into a cell. Even in the absence of integration into the
DNA, such vectors may continue to transcribe RNA molecules until they are
disabled by endogenous nucleases. Transient expression may last for a
month or more with a non-replicating vector, and even longer if appropriate 
replication elements are designed to be part of the vector system. 

Modifications of gene expression can be obtained by designing
antisense molecules or complementary nucleic acid sequences (DNA, RNA,
or PNA), to the control, 5', or regulatory regions of the genes encoding the
novel HDAC polypeptides (e.g., signal sequence, promoters, enhancers, and
introns). Oligonucleotides derived from the transcription initiation site, e.g.,
between positions -10 and +10 from the start site, are preferable. Similarly,
inhibition can be achieved using "triple helix" base-pairing methodology.
Triple helix pairing is useful because it causes inhibition of the ability of the
double helix to open sufficiently for the binding of polymerases, transcription
factors, or regulatory molecules. Recent therapeutic advances using triplex
DNA have been described (see, for example, J.E. Gee et al., 1994, In: B.E.
Huber and B.I. Carr, Molecular and Immunologic Approaches, Futura
Publishing Co., Mt. Kisco, NY). The antisense molecule or complementary
sequence may also be designed to block translation of mRNA by preventing 
the transcript from binding to ribosomes. 

Ribozymes, i.e., enzymatic RNA molecules, may also be used to
catalyze the specific cleavage of RNA. The mechanism of ribozyme action
involves sequence-specific hybridization of the ribozyme molecule to
complementary target RNA, followed by endonucleolytic cleavage. Suitable
examples include engineered hammerhead motif ribozyme molecules that can
specifically and efficiently catalyze endonucleolytic cleavage of sequences
encoding the HDAC polypeptides.

Specific ribozyme cleavage sites within any potential RNA target are 
initially identified by scanning the target molecule for ribozyme cleavage sites 
which include the following sequences: GUA, GUU, and GUC. Once 
identified, short RNA sequences of between 15 and 20 ribonucleotides
corresponding to the region of the target gene containing the cleavage site
may be evaluated for secondary structural features which may render the
oligonucleotide inoperable. The suitability of candidate targets may also be
evaluated by testing accessibility to hybridization with complementary
oligonucleotides using ribonuclease protection assays. 
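
By way of illustration only, the initial scan for GUA, GUU, and GUC cleavage sites and the extraction of 15 to 20 ribonucleotide regions around each site can be sketched as follows. The function names and the centering of each window on the triplet are assumptions made for the sketch; the evaluation of secondary structure and the ribonuclease protection assay are not modeled.

```python
# Illustrative sketch only: names and the window-centering rule are assumptions.
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def to_rna(seq: str) -> str:
    """Upper-case the sequence and write T as U."""
    return seq.upper().replace("T", "U")


def find_cleavage_sites(seq: str):
    """Return (position, triplet) for every GUA, GUU, or GUC in the target."""
    rna = to_rna(seq)
    return [
        (i, rna[i:i + 3])
        for i in range(len(rna) - 2)
        if rna[i:i + 3] in CLEAVAGE_TRIPLETS
    ]


def candidate_windows(seq: str, window: int = 17):
    """Return (position, triplet, region) with a region of `window` nt around each site."""
    if not 15 <= window <= 20:
        raise ValueError("window must be between 15 and 20 ribonucleotides")
    rna = to_rna(seq)
    flank = (window - 3) // 2
    regions = []
    for pos, triplet in find_cleavage_sites(rna):
        start = max(0, pos - flank)
        end = min(len(rna), start + window)
        start = max(0, end - window)  # shift left when the site is near the 3' end
        regions.append((pos, triplet, rna[start:end]))
    return regions


if __name__ == "__main__":
    for pos, triplet, region in candidate_windows("AAAAAAAAAAGUCAAAAAAAAAA"):
        print(pos, triplet, region)
```

Each returned region would then be a candidate for the structural and accessibility evaluation described above.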

Complementary ribonucleic acid molecules and ribozymes according to 
the invention may be prepared by any method known in the art for the
synthesis of nucleic acid molecules. Such methods include techniques for
chemically synthesizing oligonucleotides, for example, solid phase
phosphoramidite chemical synthesis. Alternatively, RNA molecules may be
generated by in vitro and in vivo transcription of DNA sequences encoding the
human HDACs of the present invention. Such DNA sequences may be
incorporated into a wide variety of vectors with suitable RNA polymerase
promoters such as T7 or SP6. Alternatively, the cDNA constructs that
constitutively or inducibly synthesize complementary HDAC RNA can be 
introduced into cell lines, cells, or tissues. 

RNA molecules may be modified to increase intracellular stability and
half-life. Possible modifications include, but are not limited to, the addition of
flanking sequences at the 5' and/or 3' ends of the molecule, or the use of
phosphorothioate or 2' O-methyl (rather than phosphodiester) linkages
within the backbone of the molecule. This concept is inherent in the
production of PNAs and can be extended in all of these molecules by the
inclusion of nontraditional bases such as inosine, queuosine, and wybutosine,
as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine,
cytidine, guanine, thymine, and uridine which are not as easily recognized by
endogenous endonucleases.

Many methods for introducing vectors into cells or tissues are available
and are equally suitable for use in vivo, in vitro, and ex vivo. For ex vivo
therapy, vectors may be introduced into stem cells taken from the patient and 
clonally propagated for autologous transplant back into that same patient. 
Delivery by transfection and by liposome injections may be achieved using 
methods that are well known in the art. 

In another embodiment of the present invention, an expression vector
containing the complement of the polynucleotide encoding an HDAC
polypeptide, or an antisense HDAC oligonucleotide, may be administered to
an individual to treat or prevent a disease or disorder associated with
uncontrolled or neoplastic cell growth, hyperactivity or stimulation, for
example. A variety of specialized oligonucleotide delivery techniques may be
employed, for example, encapsulation in unilamellar liposomes and
reconstituted Sendai virus envelopes for RNA and DNA delivery (Arad et al.,
1986, Biochim. Biophys. Acta, 859:88-94).

In another embodiment, the proteins, antagonists, antibodies, agonists, 
complementary sequences, or vectors of the present invention can be 
administered in combination with other appropriate therapeutic agents.
Selection of the appropriate agents for use in combination therapy may be
made by one of ordinary skill in the art, according to conventional
pharmaceutical principles. The combination of therapeutic agents may act
synergistically to effect the treatment or prevention of the various disorders
described above. Using this approach, one may be able to achieve
therapeutic efficacy with lower dosages of each agent, thus reducing the
potential for adverse side effects.

Any of the therapeutic methods described above may be applied to any 
individual in need of such therapy, including, for example, mammals such as 
dogs, cats, cows, horses, rabbits, monkeys, and most preferably, humans. 

Another aspect of the present invention involves a method for
modulating one or more of growth, differentiation, or survival of a mammalian
cell by modulating HDAC bioactivity, e.g., by inhibiting the deacetylase activity
of HDAC proteins, or disrupting certain protein-protein interactions. In
general, whether carried out in vivo, in vitro, ex vivo, or in situ, the method
comprises treating a cell with an effective amount of an HDAC therapeutic so
as to alter, relative to an effect in the absence of treatment, one or more of (i)
rate of growth or proliferation, (ii) differentiation, or (iii) survival of the cell.
Accordingly, the method can be carried out with HDAC therapeutics, such as
peptides and peptidomimetics, or other molecules identified in the drug
screening methods as described herein which antagonize the effects of a
naturally-occurring HDAC protein on a cell.

Other HDAC therapeutics include antisense constructs for inhibiting
expression of HDAC proteins, and dominant negative mutants of HDAC
proteins which competitively inhibit protein-substrate and/or protein-protein
interactions upstream and downstream of the wild-type HDAC protein. In an
exemplary embodiment, an antisense method is used to treat tumor cells by
antagonizing HDAC activity and blocking cell cycle progression. The method
includes, but is not limited to, the treatment of testicular cells, so as to modulate
spermatogenesis; the modulation of osteogenesis or chondrogenesis,
comprising the treatment of osteogenic cells or chondrogenic cells,
respectively, with an HDAC polypeptide. In addition, HDAC polypeptides can
be used to modulate the differentiation of progenitor cells, e.g., the method
can be used to cause differentiation of hematopoietic cells, neuronal cells, or
other stem/progenitor cell populations, to maintain a cell in a differentiated
state, and/or to enhance the survival of a differentiated cell, e.g., to prevent
apoptosis or other forms of cell death.

The present method is applicable, for example, to cell culture 
techniques, such as in the culturing of hematopoietic cells and other cells 
whose survival or differentiation state is dependent on HDAC function. 
Moreover, HDAC agonists and antagonists can be used for therapeutic
intervention, such as to enhance survival and maintenance of cells, as well as
to influence organogenic pathways, such as tissue patterning and other
differentiation processes. As an example, such a method is practiced for
modulating, in an animal, cell growth, cell differentiation or cell survival, and
comprises administering a therapeutically effective amount of an HDAC
polypeptide to alter, relative to the absence of HDAC treatment, one or more of
(i) rate of cell growth or proliferation, (ii) cell differentiation, and/or (iii) cell
survival of one or more cell types in an animal.

In another of its aspects, the present invention provides a method of
determining if a subject, e.g., a human patient, is at risk for a disorder
characterized by unwanted cell proliferation or aberrant control of
differentiation. The method includes detecting, in a tissue of the subject, the
presence or the absence of a genetic lesion characterized by at least one of
(i) a mutation of a gene encoding an HDAC protein, e.g., represented in one of
SEQ ID NO:1, SEQ ID NO:12, SEQ ID NO:19, SEQ ID NO:88, SEQ ID
NO:94, or SEQ ID NO:96, or a homolog thereof, or (ii) the mis-expression of
an HDAC gene. More specifically, detecting the genetic lesion includes
ascertaining the existence of at least one of a deletion of one or more
nucleotides from an HDAC gene; an addition of one or more nucleotides to
the gene; a substitution of one or more nucleotides of the gene; a gross
chromosomal rearrangement of the gene; an alteration in the level of a
messenger RNA transcript of the gene; the presence of a non-wild type
splicing pattern of an mRNA transcript of the gene; or a non-wild type level of
the protein.

For example, detecting a genetic lesion can include (i) providing a
probe/primer including an oligonucleotide containing a region of nucleotide
sequence which hybridizes to a sense or antisense sequence of an HDAC
gene, e.g., a nucleic acid represented in one of SEQ ID NO:1, SEQ ID NO:12,
SEQ ID NO:19, SEQ ID NO:88, SEQ ID NO:94, or SEQ ID NO:96, or naturally
occurring mutants thereof, or 5' or 3' flanking sequences naturally associated
with the HDAC gene; (ii) exposing the probe/primer to nucleic acid of the
tissue; and (iii) detecting, by hybridization of the probe/primer to the nucleic
acid, the presence or absence of the genetic lesion; e.g., wherein detecting
the lesion comprises utilizing the probe/primer to determine the nucleotide
sequence of the HDAC gene and, optionally, of the flanking nucleic acid
sequences. For instance, the probe/primer can be employed in a polymerase
chain reaction (PCR) or in a ligation chain reaction (LCR). In alternative
embodiments, the level of an HDAC protein is detected in an immunoassay
using an antibody that is specifically immunoreactive with the HDAC protein.

Methods And Therapeutic Uses Related To Cell Modulation

Another aspect of the present invention relates to a method of inducing
and/or maintaining a differentiated state, enhancing survival, and/or inhibiting
(or alternatively, potentiating) the proliferation of a cell, by contacting cells with
an agent that modulates HDAC-dependent transcription. In view of the
apparently broad involvement of HDAC proteins in the control of chromatin
structure and, in turn, transcription and replication, the present invention
contemplates a method for generating and/or maintaining an array of different
tissues both in vitro and in vivo. An "HDAC therapeutic," whether inhibitory or
potentiating with respect to modulating histone deacetylation, can be, as
appropriate, any of the preparations described herein, including isolated
polypeptides, gene therapy constructs, antisense molecules, peptidomimetics,
or agents identified in the drug and bioactive screening assays and methods
described herein.

As an aspect of the present invention, the HDAC modulatory (i.e.,
inhibitory or stimulatory) compounds are likely to play an important role in
effecting cellular proliferation. There is a wide variety of pathological cell
proliferative conditions for which HDAC therapeutic agents of the present
invention may be used in treatment. For instance, such agents can provide
therapeutic benefits in the inhibition of an anomalous cell proliferation.
Nonlimiting examples of diseases and conditions that may benefit from such
methods include various cancers and leukemias, psoriasis, bone diseases,
fibroproliferative disorders, e.g., those involving connective tissues,
atherosclerosis and other smooth muscle proliferative disorders, as well as
chronic inflammation.

Non-limiting cancer types include carcinoma (e.g., adenocarcinoma),
sarcoma, myeloma, leukemia, and lymphoma, and mixed types of cancers,
such as adenosquamous carcinoma, mixed mesodermal tumor,
carcinosarcoma, and teratocarcinoma. Representative cancers include, but
are not limited to, bladder cancer, lung cancer, breast cancer, colon cancer,
rectal cancer, endometrial cancer, ovarian cancer, head and neck cancer,
prostate cancer, and melanoma. Specifically included are AIDS-related
cancers (e.g., Kaposi's Sarcoma, AIDS-related lymphoma), bone cancers
(e.g., osteosarcoma, malignant fibrous histiocytoma of bone, Ewing's
Sarcoma, and related cancers), and hematologic/blood cancers (e.g., adult
acute lymphoblastic leukemia, childhood acute lymphoblastic leukemia, adult
acute myeloid leukemia, childhood acute myeloid leukemia, chronic
lymphocytic leukemia, chronic myelogenous leukemia, hairy cell leukemia,
cutaneous T-cell lymphoma, adult Hodgkin's disease, childhood Hodgkin's
disease, Hodgkin's disease during pregnancy, mycosis fungoides, adult
non-Hodgkin's lymphoma, childhood non-Hodgkin's lymphoma, non-Hodgkin's
lymphoma during pregnancy, primary central nervous system lymphoma,
Sezary syndrome, Waldenstrom's
macroglobulinemia, multiple myeloma/plasma cell neoplasm, myelodysplastic
syndrome, and myeloproliferative disorders).

Also included are brain cancers (e.g., adult brain tumor, childhood
brain stem glioma, childhood cerebellar astrocytoma, childhood cerebral
astrocytoma, childhood ependymoma, childhood medulloblastoma,
supratentorial primitive neuroectodermal and pineal tumors, and childhood visual
pathway and hypothalamic glioma), digestive/gastrointestinal cancers (e.g.,
anal cancer, extrahepatic bile duct cancer, gastrointestinal carcinoid tumor,
colon cancer, esophageal cancer, gallbladder cancer, adult primary liver
cancer, childhood liver cancer, pancreatic cancer, rectal cancer, small
intestine cancer, and gastric cancer), musculoskeletal cancers (e.g.,
childhood rhabdomyosarcoma, adult soft tissue sarcoma, childhood soft
tissue sarcoma, and uterine sarcoma), and endocrine cancers (e.g.,
adrenocortical carcinoma, gastrointestinal carcinoid tumor, islet cell carcinoma
(endocrine pancreas), parathyroid cancer, pheochromocytoma, pituitary
tumor, and thyroid cancer).

Further included are neurologic cancers (e.g., neuroblastoma, pituitary
tumor, and primary central nervous system lymphoma), eye cancers (e.g.,
intraocular melanoma and retinoblastoma), genitourinary cancers (e.g.,
bladder cancer, kidney (renal cell) cancer, penile cancer, transitional cell renal
pelvis and ureter cancer, testicular cancer, urethral cancer, Wilms' tumor and
other childhood kidney tumors), respiratory/thoracic cancers (e.g., non-small
cell lung cancer, small cell lung cancer, malignant mesothelioma, and
malignant thymoma), germ cell cancers (e.g., childhood extracranial germ cell
tumor and extragonadal germ cell tumor), skin cancers (e.g., melanoma and
Merkel cell carcinoma), gynecologic cancers (e.g., cervical cancer,
endometrial cancer, gestational trophoblastic tumor, ovarian epithelial cancer,
ovarian germ cell tumor, ovarian low malignant potential tumor, uterine
sarcoma, vaginal cancer, and vulvar cancer), and unknown primary cancers.

In certain aspects of the invention, the disclosed HDAC inhibitors,
antisense molecules, anti-HDAC antibodies, or antibody fragments can be
used as treatments for breast or prostate cancers. In particular, HDAC9c
inhibitors, HDAC9c antisense molecules, anti-HDAC9c antibodies, or
fragments thereof, can be used. Specific breast cancers include, but are not
limited to, non-invasive cancers, such as ductal carcinoma in situ (DCIS),
intraductal carcinoma, lobular carcinoma in situ (LCIS), papillary carcinoma,
and comedocarcinoma, or invasive cancers, such as adenocarcinomas, or
carcinomas, e.g., infiltrating ductal carcinoma, infiltrating lobular carcinoma,
infiltrating ductal and lobular carcinoma, medullary carcinoma, mucinous
(colloid) carcinoma, comedocarcinoma, Paget's Disease, papillary carcinoma,
tubular carcinoma, and inflammatory carcinoma. Specific prostate cancers
may include adenocarcinomas and sarcomas, or pre-cancerous conditions,
such as prostate intraepithelial neoplasia (PIN).

In addition to proliferative disorders, the present invention envisions the
use of HDAC therapeutics for the treatment of differentiation disorders
resulting from, for example, de-differentiation of tissue which may (optionally)
be accompanied by abortive reentry into mitosis, e.g., apoptosis. Such
degenerative disorders include chronic neurodegenerative diseases of the
nervous system, including Alzheimer's disease, Parkinson's disease,
Huntington's chorea, amyotrophic lateral sclerosis (ALS) and the like, as well
as spinocerebellar degenerations. Other differentiation disorders include, for
example, disorders associated with connective tissue, such as can occur due
to de-differentiation of chondrocytes or osteocytes, as well as vascular
disorders which involve de-differentiation of endothelial tissue and smooth
muscle cells, gastric ulcers characterized by degenerative changes in
glandular cells, and renal conditions marked by failure to differentiate, e.g.,
Wilms' tumors.

It will also be recognized that, by transient use of modulators of HDAC 
activities, in vivo reformation of tissue can be accomplished, for example, in
the development and maintenance of organs. By controlling the proliferative
and differentiation potential for different cell types, HDAC therapeutics can be
used to re-form injured tissue, or to improve grafting and morphology of
transplanted tissue. As an example, HDAC antagonists and agonists can be
employed in a differential manner to regulate different stages of organ repair
after physical, chemical or pathological insult or injury. Such regimens can be 
utilized, for example, in the repair of cartilage, increasing bone density, liver 
repair subsequent to a partial hepatectomy, or to promote regeneration of 
lung tissue in the treatment of emphysema. 

The present method is also applicable to cell culture techniques.
More specifically, HDAC therapeutics can be used to induce
differentiation of uncommitted progenitor cells, thus giving rise to a committed
progenitor cell, or causing further restriction of the developmental fate of a
committed progenitor cell toward becoming a terminally differentiated cell. As
an example, methods involving HDAC therapeutics can be used in vitro, ex
vivo, or in vivo to induce and/or to maintain the differentiation of hematopoietic
cells into erythrocytes and other cells of the hematopoietic cell lineage.
Illustratively, the effect of erythropoietin (EPO) on the growth of
EPO-responsive erythroid precursor cells is increased to influence their
differentiation into red blood cells. Also, as an example, the amount of EPO,
or other differentiating agent, that is required for growth and/or differentiation
is reduced based on the administration of an inhibitor of histone deacetylation
(PCT/US92/07737).

Accordingly, HDAC therapeutics as described, particularly those that
antagonize HDAC deacetylase activity, can be administered alone or in
conjunction with EPO, for example, in a suitable carrier, to vertebrates to
promote erythropoiesis. Alternatively, ex vivo cell treatments are suitable.
Similar types of treatments can be used for a variety of disease states,
including use in individuals who require bone marrow transplants (e.g.,
patients with aplastic anemia, acute leukemias, recurrent lymphomas, or solid
tumors). As an example, prior to receiving a bone marrow transplant, a
recipient is prepared by ablating or removing endogenous hematopoietic stem
cells. Such treatment is typically performed by total body irradiation, or by
delivery of a high dose of an alkylating agent or other chemotherapeutic
cytotoxic agent (Anklesaria et al., 1987, Proc. Natl. Acad. Sci. USA,
84:7681-7685). Following the preparation of the recipient, donor bone marrow cells
are injected intravenously. Optionally, HDAC therapeutics could be contacted
with the cells ex vivo or administered to the subject with the re-implanted
cells.

In addition, there may be cell-type specific HDAC proteins, and/or
some cell types may be more sensitive to the modulation of HDAC
deacetylase activities. Even within a cell type, the stage of differentiation or
position in the cell cycle could influence a cell's response to a modulatory
HDAC therapeutic agent. Accordingly, the present invention contemplates the
use of agents that modulate histone deacetylase activity to specifically inhibit
or activate certain cell types. As an illustrative example, T cell proliferation
could be preferentially inhibited so as to induce tolerance by a procedure
similar to that used to induce tolerance using sodium butyrate (see, for
example, PCT/US93/03045). Accordingly, HDAC therapeutics may be used
to induce antigen specific tolerance in any situation in which it is desirable to
induce tolerance, such as autoimmune diseases, in allogeneic or xenogeneic
transplant recipients, or in graft versus host (GVH) reactions. Tolerance is
typically induced by presenting the tolerizing compound (e.g., an HDAC
inhibitor compound) substantially concurrently with the antigen, i.e., within a
time period that is reasonably close to that in which the antigen is
administered. Preferably, the HDAC therapeutic is administered after
presentation of the antigen, so that the cumulative effect will occur after the
particular repertoire of Th cells begins to undergo clonal expansion.
Additionally, the present invention contemplates the application of HDAC
therapeutics for modulating morphogenic signals involved in organogenic
pathways. Thus, it is apparent that compositions comprising HDAC
therapeutics can be employed for both cell culture and therapeutic methods
involving the generation and maintenance of tissue.

In a further aspect, HDAC therapeutics are useful in increasing the
amount of protein produced by a cell, including a recombinant cell. Suitable
cells may comprise any primary cell isolated from any animal, cultured cells,
immortalized cells, transfected or transformed cells, and established cell lines.
Animal cells preferably will include cells which intrinsically have an ability to
produce a desired protein; cells which are induced to have an ability to
produce a desired protein, for example, by stimulation with a cytokine such as
an interferon or an interleukin; or genetically engineered cells into which a gene
encoding a desired protein is introduced. The protein produced by the
process can include peptides or proteins, including peptide-hormone or
proteinaceous hormones such as any useful hormone, cytokine, interleukin, or
protein which it may be desirable to be produced in purified form and/or in
large quantity.

In specific aspects, the HDAC inhibitors, antisense molecules, anti-HDAC
antibodies, or antibody fragments of the invention can be used in
combination with other HDAC inhibitory agents, e.g., trichostatin A (D.M.
Vigushin et al., 2001, Clin. Cancer Res. 7(4):971-6), trapoxin A (Itazaki et al.,
1990, J. Antibiot. 43:1524-1532), MS-275 (T. Suzuki et al., 1999, J. Med.
Chem. 42(15):3001-3), CHAPs (Y. Komatsu et al., 2001, Cancer Res.
61(11):4459-66), CI-994 (see, e.g., P.M. LoRusso et al., 1996, New Drugs
14(4):349-56), SAHA (V.M. Richon et al., 2001, Blood Cells Mol. Dis.
27(1):260-4), depsipeptide (FR901228; FK228; V. Sandor et al., 2002, Clin.
Cancer Res. 8(3):718-28), CBHA (D.C. Coffey et al., 2001, Cancer Res.
61(9):3591-4), pyroxamide (L.M. Butler et al., 2001, Clin. Cancer Res.
7(4):962-70), CHAP31 (Y. Komatsu et al., 2001, Cancer Res. 61(11):4459-66),
HC-toxin (Liesch et al., 1982, Tetrahedron 38:45-48), chlamydocin
(Closse et al., 1974, Helv. Chim. Acta 57:533-545), Cly-2 (Hirota et al., 1973,
Agri. Biol. Chem. 37:955-56), WF-3161 (Umehana et al., 1983, J. Antibiot.
36:478-483; M. Kawai et al., 1986, J. Med. Chem. 29(11):2409-11), Tan-1746
(Japanese Patent No. 7196686 to Takeda Yakuhin Kogyo KK), apicidin (S.H.
Kwon et al., 2002, J. Biol. Chem. 277(3):2073-80), and analogs thereof.

Screening Methods

The novel HDAC proteins, peptides and nucleic acids can be used in
screening assays to identify candidate bioactive agents or drugs that
modulate HDAC bioactivity, preferably HDAC inhibitors, for potential use to
treat neoplastic disorders, for example, to kill cancer cells and tumor cells
exhibiting uncontrolled cell growth for numerous reasons, e.g., the lack of a
suppressor molecule such as p53. In addition, HDAC proteins and encoding
nucleic acids, as well as the bioactive agents that modulate HDAC activity or
function, can be used as effectors in methods to regulate cell growth, e.g., to
kill neoplastic cells.

The HDAC polynucleotides and polypeptides can also be modulated by
interactive molecules. By "modulate" herein is meant that the bioactivity of
HDAC is altered, i.e., either increased or decreased. In a preferred
embodiment, HDAC function is inhibited. The HDACs can be used as targets
to screen for inhibitors of HDAC, e.g., naturally-occurring HDAC, function,
bioactivity, or expression in neoplastic cells and/or uncontrolled cell growth.
Examples of HDAC biological activity include the ability to modulate the
proliferation of cells. For example, inhibiting histone deacetylation causes
cells to arrest in the G1 and G2 phases of the cell cycle. The biochemical
activity associated with the novel HDAC proteins of the present invention is
also characterized in terms of binding to and (optionally) catalyzing the
deacetylation of an acetylated histone. Another biochemical property of
certain HDAC proteins involves binding to other cellular proteins, such as
RbAp48 (Qian et al., 1993, Nature, 364:648), or Sin3A (see, e.g., WO
97/35990).

Generally, in performing screening methods, HDAC polypeptide or
peptide can be non-diffusably bound to an insoluble support having isolated
sample receiving areas (e.g., a microtiter plate, an array, etc.). The criteria for
suitable insoluble supports are that they can be made of any composition to
which polypeptides can be bound; they are readily separated from soluble
material; and they are otherwise compatible with the overall method of
screening. The surface of such supports may be solid or porous and of any
convenient size or shape. Examples of suitable insoluble supports include
microtiter plates, arrays, membranes and beads. These are typically made of
glass, plastic (e.g., polystyrene), polysaccharides, nylon or nitrocellulose.
Microtiter plates and arrays are especially convenient, because a large
number of assays can be carried out simultaneously, using small amounts of
reagents and samples. The particular manner of binding the polypeptide is
not crucial, so long as it is compatible with the reagents and overall methods
of the invention, maintains the activity of the peptide and is nondiffusable.

Preferred methods of binding include the use of antibodies (which
should not hinder the binding of HDACs to associated proteins), direct binding
to "sticky" or ionic supports, chemical crosslinking, etc. Following binding of
the polypeptide, excess unbound material is removed by washing. The
sample receiving areas may then be blocked as needed through incubation
with bovine serum albumin (BSA), casein or other innocuous/nonreactive
protein.

A candidate bioactive agent is added to the assay. Novel binding
agents include specific antibodies, non-natural binding agents identified in
screens of chemical libraries, peptide analogs, etc. Of particular interest are
screening assays for agents that have a low toxicity for human cells. A wide
variety of assays may be used for this purpose, including labeled in vitro
protein-protein binding assays, electrophoretic mobility shift assays,
immunoassays for protein binding, and the like. The term "agent" as used
herein describes any molecule, e.g., protein, oligopeptide, small organic
molecule, polysaccharide, polynucleotide, etc., having the capability of directly
or indirectly altering the activity or function of HDAC polypeptides. Generally,
a plurality of assay mixtures are run in parallel with different agent
concentrations to obtain a differential response to the various concentrations.
Typically, one of these concentrations serves as a negative control, i.e., at
zero concentration, or below the level of detection.
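
By way of illustration only, the differential response across agent
concentrations can be expressed relative to the zero-concentration negative
control. The following Python sketch is not part of the disclosed method; the
function name, the units, and the readings are hypothetical and are shown
only to make the comparison concrete.

```python
def percent_inhibition(signal_by_conc):
    """Return percent inhibition at each agent concentration, relative to
    the zero-concentration (negative control) assay mixture."""
    if 0.0 not in signal_by_conc:
        raise ValueError("a zero-concentration control is required")
    control = signal_by_conc[0.0]
    if control <= 0:
        raise ValueError("control signal must be positive")
    return {
        conc: 100.0 * (1.0 - signal / control)
        for conc, signal in sorted(signal_by_conc.items())
    }


# Hypothetical readings: agent concentration (micromolar) -> assay signal.
readings = {0.0: 1000.0, 0.1: 900.0, 1.0: 500.0, 10.0: 100.0}
inhibition = percent_inhibition(readings)
```

In this sketch the control mixture defines 0% inhibition, so each parallel
mixture is scored against the same baseline.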

Candidate agents encompass numerous chemical classes, though
typically they are organic molecules, preferably small organic compounds
having a molecular weight of more than 100 and less than about 10,000
daltons, preferably less than about 2000 to 5000 daltons, as a nonlimiting
example. Candidate agents comprise functional groups necessary for
structural interaction with proteins, particularly hydrogen bonding, and
typically include at least an amine, carbonyl, hydroxyl or carboxyl group,
preferably at least two of the functional chemical groups. The candidate
agents often comprise cyclical carbon or heterocyclic structures and/or
aromatic or polyaromatic structures substituted with one or more of the above
functional groups. Candidate agents are also found among biomolecules
including peptides, saccharides, fatty acids, steroids, purines, pyrimidines,
derivatives, structural analogs or combinations thereof.

Candidate agents are obtained from a wide variety of sources including
libraries of synthetic or natural compounds. For example, numerous means
are available for random and directed synthesis of a wide variety of organic
compounds and biomolecules, including expression of randomized
oligonucleotides. Alternatively, libraries of natural compounds in the form of
bacterial, fungal, plant and animal extracts are available or readily produced.
In addition, natural or synthetically produced libraries and compounds are
readily modified through conventional chemical, physical and biochemical
means. Known pharmacological agents may be subjected to directed or
random chemical modifications, such as acylation, alkylation, esterification,
and amidification, to produce structural analogs.

The determination of the binding of the candidate biomolecule or agent
to an HDAC polypeptide may be accomplished in a number of ways practiced
in the art. In one aspect, the candidate bioactive agent is labeled, and binding
is determined directly. Where the screening assay is a binding assay, one or
more of the molecules may be joined to a label, where the label can directly or
indirectly provide a detectable signal. Various labels include radioisotopes,
enzymes, fluorescent and chemiluminescent compounds, specific binding
molecules, particles, e.g., magnetic particles, and the like. Specific binding
molecules include pairs, such as biotin and streptavidin, digoxin and
antidigoxin, etc. For the specific binding members, the complementary
member would normally be labeled with a molecule which allows detection, in
accordance with known procedures. In some embodiments, only one of the
components is labeled. Alternatively, more than one component may be
labeled with different labels; for example, the HDAC polypeptide may be
labeled with one fluorophore and the candidate agent labeled with another.

In one embodiment, the candidate bioactive agent is labeled. Labeled
candidate bioactive agents are incubated with an HDAC polypeptide for a time
sufficient to allow binding, if present. Incubations may be performed at any
temperature which facilitates optimal activity, typically between 4°C and 40°C.
Incubation periods are selected for optimum activity, but may also be
optimized to facilitate rapid high throughput screening. Typically between 0.1
and 1 hour is sufficient. Excess reagent is generally removed or washed
away. The presence or absence of the labeled component is detected to
determine and indicate binding.

A variety of other reagents may be included in the screening assay.
Such reagents include, but are not limited to, salts, neutral proteins, e.g.,
albumin, detergents, etc., which may be used to facilitate optimal
protein-protein binding and/or to reduce non-specific or background interactions. In
addition, reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be
used. Further, the mixture of components in the method may be added in any
order that provides for the requisite binding.

Kits are included as an embodiment of the present invention which
comprise containers with reagents necessary to screen test compounds.
Depending on the design of the test and the types of compounds to be
screened, such kits include human HDAC polynucleotide, polypeptide, or
peptide and instructions for performing the assay.

Inhibitors of the enzymatic activity of each of the novel HDAC
polypeptides can be identified using assays which measure the ability of an
agent to inhibit catalytic conversion of a substrate by the HDAC proteins
provided by the present invention. For example, the ability of the novel HDAC
proteins to deacetylate a histone substrate, such as histone H4, in the
presence and absence of a candidate inhibitor, can be determined using
standard enzymatic assays.

A number of methods have been employed in the art for assaying
histone deacetylase activity, and can be incorporated in the drug screening
assays of the present invention. Preferably, the assay method will employ a
labeled acetyl group linked to appropriate histone lysine residues as
substrates. In other embodiments, a histone substrate peptide can be labeled
with a group whose signal is dependent on the simultaneous presence or
absence of an acetyl group, e.g., the label can be a fluorogenic group whose
fluorescence is modulated (either quenched or potentiated) by the presence
of the acetyl moiety.

Using standard enzymatic analysis, the ability of a test agent (i.e., test
compound) to cause a statistically significant change in substrate conversion
by a histone deacetylase can be measured, and as desired, inhibition
constants, e.g., Ki values, can be calculated. The histone substrate can be
provided as a purified or semi-purified polypeptide or as part of a cell lysate.
Likewise, the histone deacetylase can be provided to a reaction mixture as a
purified or semi-purified polypeptide, or as a cell lysate. Accordingly, the
reaction mixtures can range from reconstituted protein mixtures derived with
purified preparations of histones and deacetylases, to mixtures of cell lysates,
e.g., by admixing baculovirus lysates containing recombinant histones and
deacetylases.
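
As one hedged illustration of how an inhibition constant might be calculated,
the sketch below applies the Cheng-Prusoff relation, Ki = IC50 / (1 + [S]/Km).
This relation holds only under the assumption of purely competitive inhibition,
which the present text does not assert for any particular agent; the function
name and the numerical values are hypothetical.

```python
def ki_competitive(ic50, substrate_conc, km):
    """Estimate Ki from a measured IC50 using the Cheng-Prusoff relation.

    Assumes competitive inhibition; ic50, substrate_conc and km must all be
    expressed in the same concentration units.
    """
    if ic50 <= 0 or km <= 0 or substrate_conc < 0:
        raise ValueError("ic50 and km must be positive; substrate_conc >= 0")
    return ic50 / (1.0 + substrate_conc / km)


# Hypothetical values (micromolar): IC50 measured with substrate at its Km.
ki = ki_competitive(ic50=2.0, substrate_conc=10.0, km=10.0)
```

With the substrate present at its Km, the estimated Ki is half the measured
IC50; other inhibition mechanisms would require a different relation.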

As an example, the histone substrate for assays described herein can
be provided by isolation of radiolabeled histones from metabolically labeled
cells. Cells such as HeLa cells can be labeled in culture by the addition of
[3H]acetate (New England Nuclear) to the culture media (Hay et al., 1983, J.
Biol. Chem., 258:3726-3734). The addition of an HDAC inhibitor, such as
butyrate, trapoxin and the like, can be used to increase the abundance of
acetylated histones in the cells. Radiolabeled histones can be isolated from
the cells by extraction with H2SO4 (Marushige et al., 1966, J. Mol. Biol.,
15:160-174). Briefly, cells are homogenized in buffer, centrifuged to isolate a
nuclear pellet, and the subsequently homogenized nuclear pellet is
centrifuged through sucrose. The resulting chromatin pellet is extracted by
addition of H2SO4 to yield [3H]acetyl-labeled histones. Alternatively,
nucleosome preparations containing [3H]acetyl-labeled histones can be
isolated from metabolically labeled cells. As known in the art, nucleosomes
can be isolated from cell preparations by sucrose gradient centrifugation (e.g.,
Hay et al., 1983, J. Biol. Chem., 258:3726-3734 and Noll, 1967, Nature,
215:360-363), and polynucleosomes can be prepared by NaCl precipitation
from micrococcal nuclease digested cells (Hay et al., supra).

Similar procedures for isolating labeled histones from other cell types,
including yeast, have been described (see, for example, Alonso et al., 1986,
Biochim. Biophys. Acta, 866:161-169 and Kreiger et al., 1974, J. Biol. Chem.,
249:332-334). Also, histones can be generated by recombinant gene expression,
and can include an exogenous tag (e.g., an HA epitope, a poly(his) sequence, and
the like) which facilitates purification from cell extracts. Further, whole nuclei
can be isolated from metabolically labeled cells by micrococcal nuclease
digestion (Hay et al., supra).

The deacetylase substrate can also be provided as an acetylated
peptide including a sequence corresponding to the sequence around the
specific lysyl residues acetylated on histones, e.g., peptidyl portions of the
core histones H2A, H2B, H3, or H4. Such fragments can be produced by
cleavage of acetylated histones derived from metabolically labeled cells, e.g.,
by treatment with proteolytic enzymes or cyanogen bromide (Kreiger et al.,
supra). The acetylated peptide can also be provided by standard solid phase
synthesis using acetylated lysine residues (Id.).

The activity of a histone deacetylase in assay detection methods
involving use of [3H]acetyl-labeled histones is detected by measuring the
release of [3H]acetate by standard scintillation techniques. As an illustrative
example, a reaction mixture is provided which contains a recombinant HDAC
protein suspended in buffer, along with a sample of [3H]acetyl-labeled
histones and (optionally) a test compound. The reaction mixture is
maintained at a desired temperature and pH, such as 22°C at pH 7.8, for
several hours, and the reaction is terminated by boiling, or another form of
denaturation. Released [3H]acetate is extracted and counted. For example,
the quenched reaction mixture can be acidified with concentrated HCl and
used to create a biphasic mixture with ethyl acetate. The resulting two-phase
system is thoroughly mixed, centrifuged, and the ethyl acetate phase
collected and counted by standard scintillation methods. Other methods for
detecting acetate release will be easily recognized by those having skill in the
art.
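
The counts obtained from the ethyl acetate phase can be turned into a measure
of inhibition by subtracting a no-enzyme blank, as in the following minimal
sketch. The blank correction, the function name, and the counts per minute
(cpm) shown are assumptions made for illustration and are not taken from the
procedure described above.

```python
def deacetylase_inhibition(cpm_blank, cpm_enzyme, cpm_with_agent):
    """Percent inhibition of [3H]acetate release caused by a test compound.

    cpm_blank      -- counts from a reaction without enzyme (background)
    cpm_enzyme     -- counts from enzyme plus labeled histones, no compound
    cpm_with_agent -- counts from the same reaction plus the test compound
    """
    uninhibited = cpm_enzyme - cpm_blank
    if uninhibited <= 0:
        raise ValueError("enzyme reaction shows no release above background")
    inhibited = cpm_with_agent - cpm_blank
    return 100.0 * (1.0 - inhibited / uninhibited)


# Hypothetical scintillation counts (cpm).
result = deacetylase_inhibition(cpm_blank=200.0,
                                cpm_enzyme=5200.0,
                                cpm_with_agent=1450.0)
```

Here the compound leaves one quarter of the background-corrected release, so
the sketch reports 75% inhibition.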

In yet another aspect, the drug screening assay is designed to include
a reagent cell recombinantly expressing one or more of a target protein or
HDAC protein. The ability of a test agent to alter the activity of the HDAC
protein can be detected by analysis of the recombinant cell. For instance,
agonists and antagonists of the HDAC biological activity can be detected by
scoring for alterations in growth or differentiation (phenotype) of the cell.
General techniques for detecting these characteristics are well known, and
will vary with respect to the source of the particular reagent cell utilized in any
given assay. For example, quantification of cell proliferation in the presence
and absence of a candidate agent can be measured by using a number of
techniques well known in the art, including simple measurement of population
growth curves.

Where an assay involves proliferation in a liquid medium, turbidimetric
techniques (i.e., absorbance/transmittance of light of a given wavelength
through the sample) can be utilized. For example, in a case in which the
reagent cell is a yeast cell, measurement of absorbance of light at a
wavelength between 540 and 600 nm can provide a conveniently fast
measure of cell growth. Moreover, the ability of yeast cells to form colonies in
solid medium (e.g., agar) can be used to readily score for proliferation. In
other embodiments, an HDAC substrate protein, such as a histone, can be
provided as a fusion protein which permits the substrate to be isolated from
cell lysates and the degree of acetylation detected. Each of these techniques
is suitable for high throughput analysis necessary for rapid screening of large
numbers of candidate HDAC modulatory agents.
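
As a simple illustration of scoring proliferation from turbidimetric readings,
a doubling time can be estimated from two absorbance measurements. The sketch
below assumes exponential growth and readings within the linear range of the
instrument; the function name and the optical density values are hypothetical.

```python
import math


def doubling_time(od_start, od_end, elapsed_hours):
    """Estimate culture doubling time (hours) from two absorbance readings,
    assuming exponential growth between the two time points."""
    if od_start <= 0 or od_end <= od_start or elapsed_hours <= 0:
        raise ValueError("readings must be positive and show net growth")
    return elapsed_hours * math.log(2) / math.log(od_end / od_start)


# Hypothetical readings at 600 nm taken 3 hours apart.
untreated = doubling_time(0.1, 0.4, 3.0)  # two doublings in 3 hours
treated = doubling_time(0.1, 0.2, 3.0)    # one doubling in 3 hours
```

A longer doubling time in the presence of a candidate agent would be
consistent with reduced proliferation, subject to appropriate controls.
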

In addition, in assays in which the ability of an agent to cause or
reverse a transformed phenotype is being determined, cell growth in solid or
semi-solid medium, such as agar, can further aid in establishing whether a
mammalian cell is transformed. Visual inspection of the morphology of the
reagent cell can also be used to determine whether the biological activity of
the targeted HDAC protein has been affected by the added agent. By
illustration, the ability of an agent to influence an apoptotic phenotype which is
mediated in some way by a recombinant HDAC protein can be assessed by
visual microscopy. Similarly, the formation of certain cellular structures as
part of normal cell differentiation, such as the formation of neuritic processes,
can be visualized under a light microscope.

The nature of the effect of a test agent on a reagent cell can be
assessed by measuring levels of expression of specific genes, e.g., by
reverse transcription PCR. Another method of scoring for an effect on HDAC
activity is by detecting cell-type specific marker expression through
immunofluorescent staining. Many such markers are known in the art for
which antibodies are readily available. For example, the presence of
chondroitin sulfate proteoglycans, as well as type-II collagen, is correlated
with cartilage production in chondrocytes, and each can be detected by
immunostaining. Similarly, the human kidney differentiation antigen gp160,
human aminopeptidase A, is a marker of kidney induction, and the
cytoskeletal protein troponin I is a marker of heart induction.

Also, the alteration of expression of a reporter gene construct provided
in the reagent cell provides a means of detecting an effect on HDAC activity.
For example, reporter gene constructs designed using transcriptional
regulatory sequences, e.g., the promoters, for developmentally regulated
genes can be used to drive the expression of a detectable marker, such as a
luciferase gene. For example, the construct can be prepared using the
promoter sequence from a gene expressed in a particular differentiation
phenotype.

Pharmaceutical Compositions

A further embodiment of the present invention embraces the
administration of a pharmaceutical composition, in conjunction with a
pharmaceutically acceptable carrier, diluent, or excipient, for any of the
above-described therapeutic uses and effects. Such pharmaceutical
compositions may comprise HDAC nucleic acid, polypeptide, or peptides,
antibodies to HDAC polypeptides or peptides, or fragments thereof, mimetics,
agonists (e.g., activators), antagonists (e.g., inhibitors, blockers) of the HDAC
polypeptide, peptide, or polynucleotide. The compositions may be
administered alone or in combination with at least one other agent, such as a
stabilizing compound, which may be administered in any sterile,
biocompatible pharmaceutical (or physiologically compatible) carrier,
including, but not limited to, saline, buffered saline, dextrose, and water. The
compositions may be administered to a patient alone, or in combination with
other agents, drugs, hormones, or biological response modifiers. Preferred
are compositions comprising one or more HDAC inhibitors.

The pharmaceutical compositions for use in the present invention can
be administered by any number of routes including, but not limited to,
parenteral, oral, intravenous, intramuscular, intra-arterial, intramedullary,
intrathecal, intraventricular, transdermal, subcutaneous, intraperitoneal,
intranasal, ophthalmic, enteral, topical, sublingual, vaginal, or rectal means.

Transdermal patches have the added advantage of providing controlled
delivery of a compound of the present invention to the body. Such dosage
forms can be made by dissolving or dispersing a deacetylase inhibitor in the
proper medium. Absorption enhancers can also be used to increase the flux
of the deacetylase inhibitor across the skin. The rate of such flux can be
controlled by either providing a rate controlling membrane or dispersing the
deacetylase inhibitor in a polymer matrix or gel.

Ophthalmic formulations, eye ointments, powders, solutions and the
like, are also contemplated as being within the scope of this invention.

In addition to the active ingredients (i.e., an HDAC antagonist
compound), the pharmaceutical compositions may contain suitable
pharmaceutically acceptable carriers or excipients comprising auxiliaries
which facilitate processing of the active compounds into preparations that can
be used pharmaceutically. Further details on techniques for formulation and
administration are provided in the latest edition of Remington's
Pharmaceutical Sciences (Mack Publishing Co., Easton, Pa.).

Pharmaceutical compositions for oral administration can be formulated
using pharmaceutically acceptable carriers well known in the art in dosages
suitable for oral administration. Such carriers enable the pharmaceutical
compositions to be formulated as tablets, pills, dragees, capsules, liquids,
gels, syrups, slurries, suspensions, and the like, for ingestion by the patient.

Pharmaceutical preparations for oral use can be obtained by the
combination of active compounds with a solid excipient, optionally grinding a
resulting mixture, and processing the mixture of granules, after adding
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable
excipients are carbohydrate or protein fillers, such as sugars, including
lactose, sucrose, mannitol, or sorbitol; starch from corn, wheat, rice, potato, or
other plants; cellulose, such as methyl cellulose,
hydroxypropylmethylcellulose, or sodium carboxymethylcellulose; gums, including arabic
and tragacanth; and proteins such as gelatin and collagen. If desired,
disintegrating or solubilizing agents may be added, such as cross-linked
polyvinyl pyrrolidone, agar, alginic acid, or a physiologically acceptable salt
thereof, such as sodium alginate.

Dragee cores may be used in conjunction with physiologically suitable
coatings, such as concentrated sugar solutions, which may also contain gum
arabic, talc, polyvinylpyrrolidone, carbopol gel, polyethylene glycol, and/or
titanium dioxide, lacquer solutions, and suitable organic solvents or solvent
mixtures. Dyestuffs or pigments may be added to the tablets or dragee
coatings for product identification, or to characterize the quantity of active
compound, i.e., dosage.

Pharmaceutical preparations which can be used orally include push-fit
capsules made of gelatin, as well as soft, sealed capsules made of gelatin
and a coating, such as glycerol or sorbitol. Push-fit capsules can contain
active ingredients mixed with a filler or binders, such as lactose or starches,
lubricants, such as talc or magnesium stearate, and, optionally, stabilizers. In
soft capsules, the active compounds may be dissolved or suspended in
suitable liquids, such as fatty oils, liquid, or liquid polyethylene glycol with or
without stabilizers.

Pharmaceutical formulations suitable for parenteral administration may 
be formulated in aqueous solutions, preferably in physiologically compatible 
buffers such as Hanks' solution, Ringer's solution, or physiologically buffered 
saline. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl 
cellulose, sorbitol, or dextran. In addition, suspensions of the active 
compounds may be prepared as appropriate oily injection suspensions. 
Suitable lipophilic solvents or vehicles include fatty oils such as sesame oil, or 
synthetic fatty acid esters, such as ethyl oleate or triglycerides, or liposomes.
Optionally, the suspension may also contain suitable stabilizers or agents 
which increase the solubility of the compounds to allow for the preparation of 
highly concentrated solutions. 

For topical or nasal administration, penetrants or permeation agents 
that are appropriate to the particular barrier to be permeated are used in the 
formulation. Such penetrants and permeation enhancers are generally known 
in the art. 

The pharmaceutical compositions of the present invention may be 
manufactured in a manner that is known in the art, e.g., by means of 
conventional mixing, dissolving, granulating, dragee-making, levigating, 
emulsifying, encapsulating, entrapping, or lyophilizing processes. 

The pharmaceutical composition may be provided as a salt and can be 
formed with many acids, including but not limited to, hydrochloric, sulfuric, 
acetic, lactic, tartaric, malic, succinic, and the like. Salts tend to be more 
soluble in aqueous solvents, or other protonic solvents, than are the 
corresponding free base forms. In other cases, the preferred preparation may 
be a lyophilized powder which may contain any or all of the following: 1-50 
mM histidine, 0.1%-2% sucrose, and 2-7% mannitol, at a pH range of 4.5 to 
5.5, combined with a buffer prior to use. After the pharmaceutical
compositions have been prepared, they can be placed in an appropriate
container and labeled for treatment of an indicated condition. For
administration of an HDAC inhibitor compound, such labeling would include
amount, frequency, and method of administration.

Pharmaceutical compositions suitable for use in the present invention
include compositions wherein the active ingredients are contained in an
effective amount to achieve the intended purpose. The determination of an
effective dose or amount is well within the capability of those skilled in the art.
For any compound, the therapeutically effective dose can be estimated
initially either in cell culture assays, e.g., using neoplastic cells, or in animal
models, usually mice, rabbits, dogs, or pigs. The animal model may also be
used to determine the appropriate concentration range and route of
administration. Such information can then be used and extrapolated to
determine useful doses and routes for administration in humans.

A therapeutically effective dose refers to that amount of active
ingredient, for example, an HDAC inhibitor or antagonist compound,
antibodies to an HDAC polypeptide or peptide, agonists of HDAC
polypeptides, which ameliorates, reduces, or eliminates the symptoms or the
condition. Therapeutic efficacy and toxicity may be determined by standard
pharmaceutical procedures in cell cultures or experimental animals, e.g., ED50
(the dose therapeutically effective in 50% of the population) and LD50 (the
dose lethal to 50% of the population). The dose ratio of toxic to therapeutic
effects is the therapeutic index, which can be expressed as the ratio,
LD50/ED50. Pharmaceutical compositions which exhibit large therapeutic
indices are preferred. The data obtained from cell culture assays and animal
studies are used in determining a range of dosages for human use. The preferred
dosage contained in a pharmaceutical composition is within a range of
circulating concentrations that include the ED50 with little or no toxicity. The
dosage varies within this range depending upon the dosage form employed,
sensitivity of the patient, and the route of administration.
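
The therapeutic index defined above is a simple ratio, which the following
sketch computes. The function name and the dose values are hypothetical and
serve only to show the arithmetic; they do not describe any compound of the
invention.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index LD50/ED50.

    Both doses must be positive and expressed in the same units
    (for example, mg per kg of body weight).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg for two candidate compounds.
index_a = therapeutic_index(ld50=500.0, ed50=25.0)
index_b = therapeutic_index(ld50=500.0, ed50=100.0)
preferred = "A" if index_a > index_b else "B"
```

Under these illustrative values, compound A has the larger index and would
be preferred on this criterion alone.
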

The exact dosage will be determined by the practitioner, who will
consider the factors related to the individual requiring treatment. Dosage and
administration are adjusted to provide sufficient levels of the active moiety or
to maintain the desired effect. Factors which may be taken into account
include the severity of the individual's disease state, general health of the
patient, age, weight, and gender of the patient, diet, time and frequency of
administration, drug combination(s), reaction sensitivities, and
tolerance/response to therapy. As a general guide, long-acting
pharmaceutical compositions may be administered every 3 to 4 days, every
week, or once every two weeks, depending on half-life and clearance rate of
the particular formulation.

Normal dosage amounts may vary from 0.1 to 100,000 micrograms
(µg), up to a total dose of about 1 gram (g), depending upon the route of
administration. Guidance as to particular dosages and methods of delivery is
provided in the literature and is generally available to practitioners in the art.
Those skilled in the art will employ different formulations for nucleotides than
for proteins or their inhibitors. Similarly, delivery of polynucleotides or
polypeptides will be specific to particular cells, conditions, locations, and the
like.

Assays and Diagnostics

In another embodiment of the present invention, antibodies which
specifically bind to the HDAC polypeptides or peptides of the present
invention may be used for the diagnosis of conditions or diseases
characterized by expression (or overexpression) of an HDAC polynucleotide
or polypeptide, or in assays to monitor patients being treated with modulatory
compounds of HDAC polypeptides, or, for example, HDAC antagonists or
inhibitors. The antibodies useful for diagnostic purposes may be prepared in
the same manner as those described above for use in therapeutic methods.
Diagnostic assays for the HDAC polypeptides include methods which utilize
the antibody and a label to detect the protein in human body fluids or extracts
of cells or tissues. The antibodies may be used with or without modification,
and may be labeled by joining them, either covalently or non-covalently, with a
reporter molecule. A wide variety of reporter molecules which are known in
the art may be used, several of which are described above.

Several assay protocols including ELISA, RIA, and FACS for
measuring an HDAC polypeptide or peptide are known in the art and provide
a basis for diagnosing altered or abnormal levels of HDAC polypeptide
expression. Normal or standard values for HDAC polypeptide expression are
established by combining body fluids or cell extracts taken from normal
mammalian subjects, preferably human, with antibody to HDAC polypeptide
or peptide under conditions suitable for complex formation. The amount of
standard complex formation may be quantified by various methods;
photometric means are preferred. Quantities of HDAC polypeptide or peptide
expressed in subject samples, control samples, and disease samples from
biopsied tissues are compared with the standard values. Deviation between
standard and subject values establishes the parameters for diagnosing
disease.
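
One possible way to quantify deviation between standard and subject values
is a standard score computed against a panel of normal samples. The sketch
below is offered only as an assumption-laden example: the use of a z-score,
the threshold of 2.0, the function name, and the photometric readings are all
hypothetical and are not prescribed by the present text.

```python
import statistics


def deviation_from_standard(normal_values, subject_value):
    """Return how many sample standard deviations the subject value lies
    from the mean of the normal (standard) values."""
    mean = statistics.mean(normal_values)
    spread = statistics.stdev(normal_values)  # needs at least two values
    if spread == 0:
        raise ValueError("normal values show no variation")
    return (subject_value - mean) / spread


# Hypothetical photometric readings (arbitrary absorbance units).
normal_panel = [10.0, 12.0, 11.0, 9.0, 13.0]
z = deviation_from_standard(normal_panel, 16.0)
flagged = abs(z) > 2.0  # illustrative cutoff, not a validated criterion
```

Any cutoff actually used for diagnosis would need to be established
empirically for the particular assay and population.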

In one embodiment of the present invention, anti-HDAC antibodies
(e.g., anti-HDAC9c antibodies) can be used in accordance with established
methods to detect the presence of specific cancers or tumors, such as breast
or prostate cancers or tumors. Representative cancers and cancer types are
listed above.

According to another embodiment of the present invention, the
polynucleotides encoding the novel HDAC polypeptides may be used for
diagnostic purposes. The polynucleotides which may be used include
oligonucleotide sequences, complementary RNA and DNA molecules, and
PNAs. The polynucleotides may be used to detect and quantify
HDAC-encoding nucleic acid expression in biopsied tissues in which expression (or
under- or overexpression) of HDAC polynucleotide may be correlated with
disease. The diagnostic assay may be used to distinguish between the
absence, presence, and excess expression of HDAC, and to monitor
regulation of HDAC polynucleotide levels during therapeutic treatment or
intervention.

In a related aspect, hybridization with PCR probes which are capable
of detecting polynucleotide sequences, including genomic sequences,
encoding an HDAC polypeptide, or closely related molecules, may be used to
identify nucleic acid sequences which encode an HDAC polypeptide. The
specificity of the probe, whether it is made from a highly specific region, e.g.,
about 8 to 10 or 12 or 15 contiguous nucleotides in the 5' regulatory region, or
a less specific region, e.g., especially in the 3' coding region, and the
stringency of the hybridization or amplification (maximal, high, intermediate, or
low) will determine whether the probe identifies only naturally occurring
sequences encoding the HDAC polypeptide, alleles thereof, or related
sequences.

Probes may also be used for the detection of related sequences, and
should preferably contain at least 50%, preferably at least 80%, of the
nucleotides encoding an HDAC polypeptide. The hybridization probes of this
invention may be DNA or RNA and may be derived from the nucleotide
sequence of SEQ ID NO:1, SEQ ID NO:12, SEQ ID NO:19, SEQ ID NO:88,
SEQ ID NO:94, or SEQ ID NO:96, or from genomic sequence including
promoter, enhancer elements, and introns of the naturally occurring HDAC
protein.

20 The nucleotide sequences of the novel HDAC genes presented herein 

will further allow for the generation of probes and primers designed for use in 
identifying and/or cloning HDAC homologs in other cell types, e.g. from other 
tissues, as well as HDAC homologs from other organisms. For example, the 
present invention also provides a probe/primer comprising a substantially 

25 purified oligonucleotide, which oligonucleotide comprises a region of 
nucleotide sequence that hybridizes under stringent conditions to at least 10 
consecutive nucleotides of sense or anti-sense sequence selected from the 
group consisting of HDAC SEQ ID NO:1, SEQ ID NO:12, SEQ ID NO:19, 
SEQ ID NO:88, SEQ ID NO:94, or SEQ ID NO:96, or naturally occurring 

30 mutants thereof. Primers based on the nucleic acid represented in SEQ ID 
NO:1, SEQ ID NO:12, SEQ ID NO:19, SEQ ID NO:88, SEQ ID NO:94, or 
SEQ ID NO:96, or as presented in the tables herein, can be used in PCR 
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reactions to clone HDAC homologs. Likewise, probes based on the HDAC 
sequences provided herein can be used to detect transcripts or genomic 
sequences encoding the same or homologous proteins. The probe preferably 
comprises a label moiety attached thereto and is able to be detected, e.g., the 
label moiety is selected from radioisotopes, fluorescent compounds,
chemiluminescent compounds, enzymes, enzyme co-factors, and the like. 
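
The probe requirement described above (at least 10 consecutive nucleotides of
sense or anti-sense sequence) can be illustrated with a minimal sketch. It
assumes exact string matching as a stand-in for hybridization under stringent
conditions, and the function names and the toy target sequence are
hypothetical, not part of the disclosure.

```python
# Minimal sketch: exact matching stands in for stringent hybridization.
# All names and the toy target below are illustrative assumptions.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_sites(target: str, probe: str, min_len: int = 10):
    """Return (orientation, position) pairs where the probe matches the target.

    "sense" means the probe itself occurs in the target; "antisense" means
    the reverse complement of the probe occurs in the target.
    """
    if len(probe) < min_len:
        raise ValueError(f"probe must be at least {min_len} nt long")
    target, probe = target.lower(), probe.lower()
    hits = []
    for orientation, query in (("sense", probe),
                               ("antisense", reverse_complement(probe))):
        start = target.find(query)
        while start != -1:
            hits.append((orientation, start))
            start = target.find(query, start + 1)
    return hits


# Toy 24-nt target (illustrative only, not one of the SEQ ID NOs).
target = "ggaattgcctatgaccccttgatg"
print(probe_sites(target, "gcctatgacc"))                      # [('sense', 6)]
print(probe_sites(target, reverse_complement("gcctatgacc")))  # [('antisense', 6)]
```

Real probe design would also weigh mismatches, melting temperature, and
stringency, none of which this sketch models.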

Such probes can also be used as a part of a diagnostic test kit for 
identifying cells or tissue which mis-express an HDAC protein, such as by 
measuring a level of an HDAC encoding nucleic acid in a sample of cells from 
a patient; e.g., detecting HDAC mRNA levels, or determining whether a
genomic HDAC gene has been mutated or deleted. To this end, nucleotide 
probes can be generated from the HDAC sequences herein which facilitate 
histological screening of intact tissue and tissue samples for the presence (or 
absence) of HDAC-encoding transcripts. Similar to the diagnostic uses of 
anti-HDAC antibodies, probes directed to HDAC messages, or to
genomic HDAC sequences, can be used for both predictive and therapeutic 
evaluation of allelic mutations which might be manifest in, for example, 
neoplastic or hyperplastic disorders (e.g. unwanted cell growth), or the 
abnormal differentiation of tissue. Used in conjunction with immunoassays as 
described herein, the oligonucleotide probes can help facilitate the
determination of the molecular basis for a developmental disorder which may 
involve some abnormality associated with expression (or lack thereof) of an 
HDAC protein. For instance, variation in polypeptide synthesis can be 
differentiated from a mutation in a coding sequence. 

Accordingly, the present invention provides a method for determining if 
a subject is at risk for a disorder characterized by aberrant cell proliferation 
and/or differentiation. Such a method can be generally characterized as 
comprising detecting, in a sample of cells from a subject, the presence or 
absence of a genetic lesion characterized by at least one of (i) an alteration 
affecting the integrity of a gene or nucleic acid sequence encoding an HDAC 
polypeptide, or (ii) the mis-expression of an HDAC gene. To illustrate, such 
genetic lesions can be detected by ascertaining the existence of at least one 
of (i) a deletion of one or more nucleotides from an HDAC gene, (ii) an
addition of one or more nucleotides to an HDAC gene, (iii) a substitution of 
one or more nucleotides of an HDAC gene, (iv) a gross chromosomal 
rearrangement of an HDAC gene, (v) a gross alteration in the level of a
messenger RNA transcript of an HDAC gene, (vi) aberrant modification of an
HDAC gene, such as of the methylation pattern of the genomic DNA, (vii) the
presence of a non-wild type splicing pattern of a messenger RNA transcript of 
an HDAC gene, (viii) a non-wild type level of an HDAC polypeptide, and (ix) 
inappropriate post-translational modification of an HDAC polypeptide. 

Accordingly, the present invention provides a large number of assay
techniques for detecting lesions in an HDAC gene, and importantly, provides 
the ability to distinguish between different molecular causes underlying 
HDAC-dependent aberrant cell growth, proliferation and/or differentiation. 

Methods for producing specific hybridization probes for DNA encoding
the HDAC polypeptides include the cloning of nucleic acid sequence that
encodes the HDAC polypeptides, or HDAC derivatives, into vectors for the 
production of mRNA probes. Such vectors are known in the art, commercially 
available, and may be used to synthesize RNA probes in vitro by means of 
the addition of the appropriate RNA polymerases and the appropriate labeled
nucleotides. Hybridization probes may be labeled by a variety of
detector/reporter groups, e.g., radionuclides such as 32P or 35S, or enzymatic
labels, such as alkaline phosphatase coupled to the probe via avidin/biotin
coupling systems, and the like. 

The polynucleotide sequences encoding the HDAC polypeptides may
be used in Southern or Northern analysis, dot blot, or other membrane-based
technologies; in PCR technologies; or in dip stick, pin, ELISA or chip assays 
utilizing fluids or tissues from patient biopsies to detect the status of, e.g., 
levels or overexpression of HDAC, or to detect altered HDAC expression. 
Such qualitative or quantitative methods are well known in the art. 

In a particular aspect, the nucleotide sequences encoding the HDAC
polypeptides may be useful in assays that detect activation or induction of
various tumors, neoplasms or cancers. The nucleotide sequences encoding
the HDAC polypeptides may be labeled by standard methods, and added to a
fluid or tissue sample from a patient under conditions suitable for the 
formation of hybridization complexes. After a suitable incubation period, the 
sample is washed and the signal is quantified and compared with a standard 
value. If the amount of signal in the biopsied or extracted sample is
significantly altered from that of a comparable control sample, the nucleotide 
sequence has hybridized with nucleotide sequence present in the sample, 
and the presence of altered levels of nucleotide sequence encoding the 
HDAC polypeptides in the sample indicates the presence of the associated
disease. Such assays may also be used to evaluate the efficacy of a
particular therapeutic treatment regimen in animal studies, in clinical trials, or 
in monitoring the treatment of an individual patient. 

In one embodiment of the present invention, HDAC (e.g., HDAC9c) 
nucleic acids can be used in accordance with established methods to detect
the presence of specific cancers or tumors, such as breast or prostate
cancers or tumors. Representative cancers and cancer types are listed 
herein above. 

To provide a basis for the diagnosis of disease associated with HDAC 
expression, a normal or standard profile for expression is established. This
may be accomplished by combining body fluids or cell extracts taken from
normal subjects, either animal or human, with a sequence, or a fragment 
thereof, which encodes an HDAC polypeptide, under conditions suitable for 
hybridization or amplification. Standard hybridization may be quantified by 
comparing the values obtained from normal subjects with those from an
experiment where a known amount of a substantially purified polynucleotide is
used. Standard values obtained from normal samples may be compared with 
values obtained from samples from patients who are symptomatic for disease. 
Deviation between standard and subject (patient) values is used to establish 
the presence of disease. 
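
The comparison of patient values against a standard profile can be sketched
as follows. This is only one possible reading of "deviation": it assumes the
standard is summarized by the mean and sample standard deviation of the
normal values, and the cutoff of 2 standard deviations, the function names,
and the sample numbers are illustrative assumptions rather than values taken
from the disclosure.

```python
from statistics import mean, stdev


def deviation_from_standard(normal_values, patient_value):
    """Express a patient signal as a z-score against the normal samples."""
    mu = mean(normal_values)
    sigma = stdev(normal_values)  # sample standard deviation
    if sigma == 0:
        raise ValueError("normal values show no variation")
    return (patient_value - mu) / sigma


def is_altered(normal_values, patient_value, threshold=2.0):
    """Flag a value lying at least `threshold` SDs from the normal mean.

    The threshold is an assumed cutoff, not one stated in the text.
    """
    return abs(deviation_from_standard(normal_values, patient_value)) >= threshold


# Hypothetical hybridization signals (arbitrary units).
normal = [10, 12, 11, 9, 13]
print(is_altered(normal, 16))  # True
print(is_altered(normal, 12))  # False
```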

Once disease is established and a treatment protocol is initiated,
hybridization assays may be repeated on a regular basis to evaluate whether
the level of expression in the patient begins to approximate that which is
observed in a normal individual. The results obtained from successive assays
may be used to show the efficacy of treatment over a period ranging from 
several days to months. 

With respect to cancer, the presence of an abnormal amount of 
transcript in biopsied tissue from an individual may indicate a predisposition
for the development of the disease, or may provide a means for detecting the 
disease prior to the appearance of actual clinical symptoms. A more definitive 
diagnosis of this type may allow health professionals to employ preventative 
measures or aggressive treatment earlier, thereby preventing the
development or further progression of the cancer.

Additional diagnostic uses for oligonucleotides designed from the 
nucleic acid sequences encoding the novel HDAC polypeptides may involve 
the use of PCR. Such oligomers may be chemically synthesized, generated 
enzymatically, or produced from a recombinant source. Oligomers will
preferably comprise two nucleotide sequences, one with sense orientation
(5'->3') and another with antisense (3'->5'), employed under optimized
conditions for identification of a specific gene or condition. The same two 
oligomers, nested sets of oligomers, or even a degenerate pool of oligomers 
may be employed under less stringent conditions for detection and/or
quantification of closely related DNA or RNA sequences.
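
The pairing of a sense and an antisense oligomer can be sketched as below.
The sketch assumes the simplest case, in which the forward primer is copied
from the 5' end of the template and the reverse primer is the reverse
complement of its 3' end; the function name, primer length, and toy template
are hypothetical, and practical primer design would also consider melting
temperature and secondary structure.

```python
# Minimal sketch of deriving a sense/antisense primer pair from a template.
COMPLEMENT = str.maketrans("acgt", "tgca")


def primer_pair(template: str, length: int = 18):
    """Return (forward, reverse) primers, both written 5'->3'."""
    template = template.lower()
    if len(template) < 2 * length:
        raise ValueError("template too short for two non-overlapping primers")
    forward = template[:length]                               # sense strand
    reverse = template[-length:].translate(COMPLEMENT)[::-1]  # antisense strand
    return forward, reverse


# Toy template (illustrative only, not one of the SEQ ID NOs).
print(primer_pair("ggaattgcctatgaccccttgatg", length=8))
# ('ggaattgc', 'catcaagg')
```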

Methods suitable for quantifying the expression of HDAC include 
radiolabeling or biotinylating nucleotides, co-amplification of a control nucleic 
acid, and standard curves onto which the experimental results are 
interpolated (P.C. Melby et al., 1993, J. Immunol. Methods, 159:235-244; and
C. Duplaa et al., 1993, Anal. Biochem., 229-236). The speed of quantifying
multiple samples may be accelerated by running the assay in an ELISA 
format where the oligomer of interest is presented in various dilutions and a 
spectrophotometric or colorimetric response gives rapid quantification.
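
Interpolating an experimental result onto a standard curve can be sketched as
follows. The sketch assumes piecewise-linear interpolation between bracketing
standards, which is one common choice and not necessarily the method of the
cited references; the function name and the dilution values are hypothetical.

```python
def interpolate_amount(standards, signal):
    """Estimate an amount from a measured signal using a standard curve.

    standards: list of (known_amount, measured_signal) pairs.
    Uses linear interpolation between the two bracketing standards.
    """
    points = sorted(standards, key=lambda p: p[1])
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return a0
            return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)
    raise ValueError("signal lies outside the standard curve")


# Hypothetical standards: (amount, absorbance).
curve = [(0, 0.0), (10, 0.5), (20, 1.0)]
print(interpolate_amount(curve, 0.75))  # 15.0
```

Signals outside the range of the standards are rejected rather than
extrapolated, which is a deliberate design choice in this sketch.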

In another embodiment of the present invention, oligonucleotides, or
longer fragments derived from the HDAC polynucleotide sequences described
herein, may be used as targets in a microarray. The microarray can be used
to monitor the expression level of large numbers of genes simultaneously (to
produce a transcript image), and to identify genetic variants, mutations and
polymorphisms. This information may be used to determine gene function, to 
understand the genetic basis of a disease, to diagnose disease, and to 
develop and monitor the activities of therapeutic agents. In a particular 
aspect, the microarray is prepared and used according to the methods
described in WO 95/11995 (Chee et al.); D.J. Lockhart et al., 1996, Nature
Biotechnology, 14:1675-1680; and M. Schena et al., 1996, Proc. Natl. Acad.
Sci. USA, 93:10614-10619. Microarrays are further described in U.S. Patent
No. 6,015,702 to P. Lai et al.

In another embodiment of this invention, a nucleic acid sequence which
encodes one or more of the novel HDAC polypeptides may also be used to
generate hybridization probes which are useful for mapping the naturally 
occurring genomic sequence. The sequences may be mapped to a particular 
chromosome, to a specific region of a chromosome, or to artificial
chromosome constructions such as human artificial chromosomes (HACs), yeast
artificial chromosomes (YACs), bacterial artificial chromosomes (BACs),
bacterial P1 constructions, or single chromosome cDNA libraries, as reviewed
by C.M. Price, 1993, Blood Rev., 7:127-134 and by B.J. Trask, 1991, Trends
Genet., 7:149-154.

In another embodiment of the present invention, an HDAC polypeptide,
its catalytic or immunogenic fragments or oligopeptides thereof, can be used
for screening libraries of compounds in any of a variety of drug screening 
techniques. The fragment employed in such screening may be free in 
solution, affixed to a solid support, borne on a cell surface, or located 
intracellularly. The formation of binding complexes, between an HDAC
polypeptide, or portion thereof, and the agent being tested, may be measured
utilizing techniques commonly practiced in the art and as described above. 

Another technique for drug screening which may be used provides for 
high throughput screening of compounds having suitable binding affinity to the 
protein of interest as described in WO 84/03564. In this method, as applied to
HDAC protein, large numbers of different small test compounds are
synthesized on a solid substrate, such as plastic pins or some other surface. 
The test compounds are reacted with an HDAC polypeptide, or fragments 
thereof, and washed. Bound HDAC polypeptide is then detected by methods
well known in the art. Purified HDAC polypeptide can also be coated directly 
onto plates for use in the aforementioned drug screening techniques. 
Alternatively, non-neutralizing antibodies can be used to capture the peptide 
and immobilize it on a solid support.

Other screening and small molecule (e.g., drug) detection assays 
which involve the detection or identification of small molecules that can bind to 
a given protein, i.e., an HDAC protein, are encompassed by the present 
invention. Particularly preferred are assays suitable for high throughput
screening methodologies. In such binding-based screening or detection
assays, a functional assay is not typically required. All that is needed is a 
target protein, preferably substantially purified, and a library or panel of 
compounds (e.g., ligands, drugs, small molecules) to be screened or assayed 
for binding to the protein target. Preferably, most small molecules that bind to
the target protein will modulate activity in some manner, due to preferential,
higher affinity binding to functional areas or sites on the protein. 

An example of such an assay is the fluorescence based thermal shift
assay (3-Dimensional Pharmaceuticals, Inc., 3DP, Exton, PA) as described in
U.S. Patent Nos. 6,020,141 and 6,036,920 to Pantoliano et al. (see also J.
Zimmerman, 2000, Gen. Eng. News 20(8)). The assay allows the detection of
small molecules (e.g., drugs, ligands) that bind to expressed, and preferably 
purified, HDAC polypeptide based on affinity of binding determinations by 
analyzing thermal unfolding curves of protein-drug or ligand complexes. The 
drugs or binding molecules determined by this technique can be further
assayed, if desired, by methods, such as those described herein, to determine
if the molecules affect or modulate function or activity of the target protein. 
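
The analysis of thermal unfolding curves can be illustrated with a simplified
sketch. It assumes that each curve has already been normalized to a fraction
unfolded, that the melting temperature (Tm) is taken as the 50% crossing
found by linear interpolation, and that binding is scored as the shift in Tm
between ligand-bound and ligand-free protein. These assumptions, the function
names, and the data are illustrative and are not drawn from the cited
patents.

```python
def melting_temperature(temps, fraction_unfolded):
    """Return the temperature at which the curve crosses 50% unfolded."""
    pairs = list(zip(temps, fraction_unfolded))
    for (t0, f0), (t1, f1) in zip(pairs, pairs[1:]):
        if f0 < 0.5 <= f1:
            return t0 + (t1 - t0) * (0.5 - f0) / (f1 - f0)
    raise ValueError("curve never crosses 50% unfolded")


def thermal_shift(temps, free_curve, ligand_curve):
    """Tm with ligand minus Tm without; a positive value suggests stabilization."""
    return (melting_temperature(temps, ligand_curve)
            - melting_temperature(temps, free_curve))


# Hypothetical normalized unfolding data (degrees C, fraction unfolded).
temps = [40, 45, 50, 55, 60]
free = [0.0, 0.25, 0.75, 1.0, 1.0]
bound = [0.0, 0.0, 0.25, 0.75, 1.0]
print(thermal_shift(temps, free, bound))  # 5.0
```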

In a further embodiment of this invention, competitive drug screening 
assays can be used in which neutralizing antibodies capable of binding an 
HDAC polypeptide specifically compete with a test compound for binding to
HDAC polypeptide. In this manner, the antibodies can be used to detect the
presence of any peptide which shares one or more antigenic determinants 
with an HDAC polypeptide. 

In yet another of its aspects, the present invention provides the
identification of compounds with optimum therapeutic indices, or drugs or 
compounds which have therapeutic indices more favorable than known HDAC 
inhibitors, such as trapoxin, trichostatin, sodium butyrate, and the like. The
identification of such compounds can be made by the use of differential
screening assays which detect and compare drug mediated inhibition of 
deacetylase activity between two or more different HDAC-like enzymes, or 
which compare drug mediated inhibition of formation of complexes involving 
two or more different types of HDAC-like proteins. 
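
A differential screen of this kind ultimately compares one compound's
potency across several enzymes. As a minimal sketch, assuming potency is
summarized by an IC50 value per enzyme, selectivity can be expressed as the
ratio of each off-target IC50 to the target IC50. The enzyme labels and IC50
values below are hypothetical placeholders, and this ratio is a selectivity
measure, not a therapeutic index.

```python
def selectivity_ratios(ic50_by_enzyme, target):
    """Return IC50(other) / IC50(target) for every non-target enzyme.

    Ratios above 1 mean the compound inhibits the target more potently.
    """
    target_ic50 = ic50_by_enzyme[target]
    return {name: ic50 / target_ic50
            for name, ic50 in ic50_by_enzyme.items()
            if name != target}


# Hypothetical IC50 values in nM for one test compound.
ic50 = {"HDAC_A": 50.0, "HDAC_B": 500.0, "HDAC_C": 5000.0}
print(selectivity_ratios(ic50, "HDAC_A"))  # {'HDAC_B': 10.0, 'HDAC_C': 100.0}
```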

For example, an assay can be designed for side-by-side comparison of
the effect of a test compound on the deacetylase activity or protein
interactions of tissue-type specific HDAC proteins. Given the apparent 
diversity of HDAC proteins, it is probable that different functional HDAC 
activities, or HDAC complexes, exist and in certain instances, are localized to
particular tissue or cell types. Thus, test compounds can be screened to
identify agents that are able to inhibit the tissue-specific formation of only a 
subset of the possible repertoire of HDAC/regulatory protein complexes, or 
which preferentially inhibit certain HDAC enzymes. For instance, an 
"interaction trap assay" can be derived using two or more different human
HDAC "bait" proteins, while the "fish" protein is constant in each, e.g., a
human RbAp48 construct. Running the interaction trap side-by-side permits
the detection of agents which have a greater effect (e.g., statistically 
significant) on the formation of one of the HDAC/RbAp48 complexes than on 
the formation of the other HDAC complexes. (See, e.g., WO 97/35990). 

Similarly, differential screening assays can be used to exploit the
difference in protein interactions and/or catalytic mechanisms of mammalian
HDAC proteins and yeast RPD3 proteins, for example, in order to identify 
agents which display a statistically significant increase in specificity for 
inhibiting the yeast enzyme relative to the mammalian enzyme. Thus, lead
compounds which act specifically on pathogens, such as fungi involved in
mycotic infections, can be developed. By way of illustration, assays can be
used to screen for agents which may ultimately be useful for inhibiting at least
one fungus implicated in pathologies such as candidiasis, aspergillosis,
mucormycosis, blastomycosis, geotrichosis, cryptococcosis,
chromoblastomycosis, coccidioidomycosis, conidiosporosis, histoplasmosis,
maduromycosis, rhinosporidiosis, nocardiosis, para-actinomycosis,
penicilliosis, moniliasis, or sporotrichosis.

As an example, if the mycotic infection to which treatment is desired is 
candidiasis, the described assay can involve comparing the relative 
effectiveness of a test compound on inhibiting the deacetylase activity of a 
mammalian HDAC protein with its effectiveness in inhibiting the deacetylase
activity of an RPD3 homolog that has been cloned from yeast selected from
the group consisting of Candida albicans, Candida stellatoidea, Candida
tropicalis, Candida parapsilosis, Candida krusei, Candida pseudotropicalis,
Candida guilliermondii, or Candida rugosa. Such an assay can also be used to
identify anti-fungal agents which may have therapeutic value in the treatment
of aspergillosis by selectively targeting RPD3 homologs cloned from yeast
such as Aspergillus fumigatus, Aspergillus flavus, Aspergillus niger,
Aspergillus nidulans, or Aspergillus terreus. Where the mycotic infection is
mucormycosis, the RPD3 deacetylase can be derived from yeast such as
Rhizopus arrhizus, Rhizopus oryzae, Absidia corymbifera, Absidia ramosa, or
Mucor pusillus.

Sources of other RPD3 activities for comparison with a mammalian HDAC
activity include the pathogen Pneumocystis carinii.

In addition to such HDAC therapeutic uses, anti-fungal agents 
developed from such differential screening assays can be used, for example,
as preservatives in foodstuff, feed supplement for promoting weight gain in
livestock, or in disinfectant formulations for treatment of non-living matter, 
e.g., for decontaminating hospital equipment and rooms. In a similar fashion, 
side by side comparison of the inhibition of a mammalian HDAC protein and 
an insect HDAC-related protein, will permit selection of HDAC inhibitors which
are capable of discriminating between the human/mammalian and insect
enzymes. Accordingly, the present invention envisions the use and
formulations of HDAC therapeutics in insecticides, such as for use in
management of insects like the fruit fly.

In yet another embodiment, certain of the subject HDAC inhibitors can 
be selected on the basis of inhibitory specificity for plant HDAC-related 
activities relative to the mammalian enzyme. For example, a plant HDAC-related
protein can be disposed in a differential screen with one or more of the
human enzymes to select those compounds of greatest selectivity for 
inhibiting the plant enzyme. Thus, the present invention specifically 
contemplates formulations of HDAC inhibitors for agricultural applications,
such as in the form of a defoliant or the like.

In many drug screening programs that test libraries of compounds and 
natural extracts, high throughput assays are desirable in order to maximize 
the number of compounds surveyed in a given period of time. Assays 
performed in cell-free systems, such as may be derived with purified or
semi-purified proteins, are often preferred as "primary" screens in that they can be
rapidly generated to permit the quick development and relatively easy 
detection of an alteration in a molecular target which is mediated by a test 
compound. In addition, the effects of cellular toxicity and/or bioavailability of 
the test compound can be generally ignored in an in vitro system, since the
assay is focused primarily on the effect of the drug on the molecular target
which may be manifest in an alteration of binding affinity with upstream or 
downstream elements. 

Accordingly, in an exemplary screening assay, a reaction mixture is 
generated to include an HDAC polypeptide, compound(s) of interest, and a
"target polypeptide", e.g., a protein, which interacts with the HDAC
polypeptide, whether as a substrate or by some other protein-protein 
interaction. Exemplary target polypeptides include histones, RbAp48 
polypeptides, p53 polypeptides, and/or combinations thereof, or other
transcriptional regulatory proteins (such as myc, max, etc.). Detection and
quantification of complexes containing the HDAC protein provide a means for
determining a compound's efficacy at inhibiting (or potentiating) complex
formation between the HDAC and the target polypeptide. The efficacy of the
compound can be assessed by generating dose response curves from data
obtained using various concentrations of the test compound. Moreover, a 
control assay can also be performed to provide a baseline for comparison. In 
the control assay, isolated and purified HDAC polypeptide is added to a 
composition containing the target polypeptide and the formation of a complex
is quantified in the absence of the test compound. 
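
Generating a dose response curve against the control assay can be sketched
as follows. The sketch assumes that complex formation at each compound
concentration is expressed as a percentage of the no-compound control and
that the half-maximal inhibitory concentration (IC50) is estimated by linear
interpolation on a log10 concentration axis. Curve fitting with a sigmoidal
model would be the more usual practice; the function names and data here are
hypothetical.

```python
import math


def percent_of_control(signal, control_signal):
    """Complex formation relative to the control assay run without compound."""
    return 100.0 * signal / control_signal


def estimate_ic50(concentrations, responses):
    """Estimate IC50 from (concentration, percent-of-control) data.

    Concentrations must be positive; interpolation is linear in log10(conc).
    """
    pairs = sorted(zip(concentrations, responses))
    for (c0, r0), (c1, r1) in zip(pairs, pairs[1:]):
        if r0 >= 50.0 >= r1 and r0 != r1:
            frac = (r0 - 50.0) / (r0 - r1)
            log_ic50 = math.log10(c0) + frac * (math.log10(c1) - math.log10(c0))
            return 10 ** log_ic50
    raise ValueError("responses never cross 50% of control")


# Hypothetical data: concentration in nM, complex formation in % of control.
conc = [1, 10, 100, 1000]
resp = [90, 70, 30, 10]
print(round(estimate_ic50(conc, resp), 2))  # 31.62
```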

Complex formation between an HDAC polypeptide and the target 
polypeptide may be detected by a variety of techniques. Modulation of the 
formation of complexes can be quantified using, for example, detectably
labeled proteins such as radiolabeled, fluorescently labeled, or enzymatically
labeled HDAC polypeptides, by immunoassay, by chromatography, or by
detecting the intrinsic activity of the deacetylase.

Transgenics and Knock Outs

The present invention further encompasses transgenic non-human
mammals, preferably mice, that comprise a recombinant expression vector
harboring a nucleic acid sequence that encodes a human HDAC (e.g., SEQ 
ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:87, SEQ ID NO:93, or 
SEQ ID NO:95). 

Transgenic non-human mammals useful to produce recombinant 
proteins are well known to the skilled practitioner, as are the expression
vectors necessary and the techniques for generating transgenic animals. 
Generally, the transgenic animal comprises a recombinant expression vector 
in which the nucleotide sequence that encodes a human HDAC is operably 
linked to a tissue specific promoter whereby the coding sequence is only 
expressed in that specific tissue. For example, the tissue specific promoter
can be a mammary cell specific promoter and the recombinant protein so 
expressed is recovered from the animal's milk. 

The transgenic animals, particularly transgenic mice, containing a 
nucleic acid molecule which encodes a novel human HDAC may be used as 
animal models for studying in vivo the overexpression of HDAC and for use in
drug evaluation and discovery efforts to find compounds effective to inhibit or
modulate the activity of HDAC, such as compounds for treating
disorders, diseases, or conditions related to cell proliferation and neoplastic
cell growth. One having ordinary skill in the art using standard
techniques, such as those taught in U.S. Patent No. 4,873,191, issued Oct.
10, 1989 to Wagner and in U.S. Patent No. 4,736,866, issued April 12, 1988
to Leder, can produce transgenic animals which produce human HDAC, and
use the animals in drug evaluation and discovery projects. 

The transgenic non-human animals according to this aspect of the
present invention can express a heterologous HDAC-encoding gene, or can
have had one or more genomic HDAC genes disrupted in at least one of the
tissue or cell types of the animal. Accordingly, the invention features an
animal model for developmental diseases, which animal has one or more 
HDAC alleles which are improperly expressed. For example, a mouse can be 
bred which has one or more HDAC alleles deleted or otherwise rendered 
inactive. Such a mouse model can then be used to study disorders arising
from improperly expressed HDAC genes, as well as for evaluating potential
therapies for similar disorders. 

Another aspect of transgenic animals concerns those animals which contain
cells harboring an HDAC transgene according to the present invention and
which preferably express an exogenous HDAC protein in one or more cells in
the animal. An HDAC transgene can encode the wild-type form of the protein,
or can encode homologs thereof, including both agonists and antagonists, as 
well as antisense constructs. Preferably, the expression of the transgene is 
restricted to specific subsets of cells, tissues or developmental stages 
utilizing, for example, cis-acting sequences that control expression in the
desired pattern. According to the invention, such mosaic expression of an
HDAC protein can be essential for many forms of lineage analysis and can 
also provide a means to assess the effects of, for example, lack of HDAC 
expression which might grossly alter development in small portions of tissue 
within an otherwise normal embryo. Toward this end, tissue specific
regulatory sequences and conditional regulatory sequences can be used to
control the expression of the transgene in certain spatial patterns. Moreover,
temporal patterns of expression can be provided by, for example, conditional
recombination systems or prokaryotic transcriptional regulatory sequences. 

Genetic techniques which allow the expression of transgenes to
be regulated via site-specific genetic manipulation in vivo are known to those
skilled in the art. For instance, genetic systems are available which permit the
regulated expression of a recombinase that catalyzes the genetic 
recombination of a target sequence. The phrase "target sequence" in this 
instance refers to a nucleotide sequence that is genetically recombined by a 
recombinase. The target sequence is flanked by recombinase recognition
sequences and is generally either excised or inverted in cells expressing
recombinase activity. Recombinase catalyzed recombination events can be 
designed such that recombination of the target sequence results in either the 
activation or repression of expression of one of the present HDAC proteins. 

For example, excision of a target sequence which interferes with the
expression of a recombinant HDAC gene, such as one which encodes an
antagonistic homolog or an antisense transcript, can be designed to activate 
the expression of that gene. This interference with expression of an encoded 
product can result from a variety of mechanisms, such as spatial separation of 
the HDAC gene from the promoter element, or an internal stop codon.
Moreover, the transgene can be made so that the coding sequence of the
gene is flanked by recombinase recognition sequences and is initially 
transfected into cells in a 3' to 5' orientation with respect to the promoter 
element. In this case, inversion of the target sequence will reorient the 
subject gene by placing the 5' end of the coding sequence in an orientation
with respect to the promoter element which allows for promoter driven
transcriptional activation. 
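
The effect of inverting a flanked coding sequence can be modeled on a
single-strand string representation, as in the sketch below. It assumes that
inversion replaces the flanked segment with its reverse complement and, for
simplicity, omits the recombinase recognition sequences themselves. The
promoter and coding strings are toy placeholders, not HDAC sequences.

```python
# Minimal string model of a recombinase-mediated inversion.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def invert_segment(construct: str, start: int, end: int) -> str:
    """Replace construct[start:end] with its reverse complement."""
    segment = construct[start:end]
    inverted = segment.translate(COMPLEMENT)[::-1]
    return construct[:start] + inverted + construct[end:]


promoter = "tataaa"        # toy promoter element
coding = "atggcctaa"       # toy open reading frame: atg gcc taa
# The coding sequence starts out in the reverse (3' to 5') orientation.
silent = promoter + invert_segment(coding, 0, len(coding))
active = invert_segment(silent, len(promoter), len(silent))
print(silent)  # tataaattaggccat
print(active)  # tataaaatggcctaa
```

Applying the inversion a second time restores the original construct, which
mirrors the reversible nature of inversion between two recognition sites.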

Illustratively, transgenic non-human animals are produced by 
introducing transgenes into the germline of the non-human animal. Embryonic 
target cells at various developmental stages can be used to introduce
transgenes. Different methods are used depending on the stage of
development of the embryonic target cell. The zygote is a preferred target for 
micro-injection. 
In the mouse, the male pronucleus reaches the size of approximately
20 micrometers in diameter which allows reproducible injection of 1-2 pl of
DNA solution. The use of zygotes as a target for gene transfer has a major
advantage in that in most cases the injected DNA will be incorporated into the
host genome before the first cleavage (e.g., Brinster et al., 1985, Proc. Natl.
Acad. Sci. USA, 82:4438-4442). As a consequence, all cells of the transgenic 
non-human animal will carry the incorporated transgene. This will generally 
also be reflected in the efficient transmission of the transgene to offspring of 
the founder mice since 50% of the germ cells will harbor the transgene.
Microinjection of zygotes is the preferred method for incorporating HDAC
transgenes.

In addition, retroviral infection can also be used to introduce HDAC
transgenes into a non-human animal. The developing non-human embryo
can be cultured in vitro to the blastocyst stage. During this time, the
blastomeres are targets for retroviral infection (R. Jaenisch, 1976, Proc. Natl.
Acad. Sci. USA, 73:1260-1264). Efficient infection of the blastomeres is
obtained by enzymatic treatment to remove the zona pellucida (Manipulating
the Mouse Embryo, Hogan eds., Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, 1986). The viral vector system used to introduce the
transgene is typically a replication-defective retrovirus carrying the transgene
(Jahner et al., 1985, Proc. Natl. Acad. Sci. USA, 82:6927-6931; Van der
Putten et al., 1985, Proc. Natl. Acad. Sci. USA, 82:6148-6152). Transfection
is easily and efficiently obtained by culturing the blastomeres on a monolayer
of virus-producing cells (Stewart et al., 1987, EMBO J., 6:383-388).

Alternatively, infection can be performed at a later developmental
stage. For example, virus or virus-producing cells can be injected into the
blastocoele (e.g., Jahner et al., 1982, Nature, 298:623-628). Most of the
founder animals will be mosaic for the transgene, because incorporation
occurs only in the subset of cells which formed the transgenic non-human
animal. Further, the founders may contain various retroviral insertions of the
transgene at different positions in the genome which generally will segregate
in the offspring. It is also possible to introduce transgenes into the germline
by intrauterine retroviral infection of the midgestation embryo (Jahner et al.,
1982, supra).

A third type of target cell for transgene introduction is the embryonic 
stem cell (ES). ES cells are obtained from pre-implantation embryos that are 
cultured in vitro and fused with embryos (Evans et al., 1981, Nature,
292:154-156; Bradley et al., 1984, Nature, 309:255-258; Gossler et al., 1986,
Proc. Natl. Acad. Sci. USA, 83:9065-9069; and Robertson et al., 1986, Nature,
322:445-448). Cultured ES cell lines are available. Transgenes can be
efficiently introduced into the ES cells by DNA transfection or by
retrovirus-mediated transduction. Transformed ES cells can thereafter be combined
with blastocysts from a non-human animal. The ES cells then colonize the 
embryo and contribute to the germ line of the resulting chimeric animal. See, 
e.g., R. Jaenisch, 1988, Science, 240:1468-1474. 

Methods for making HDAC knock-out animals, or disruption transgenic
animals are also generally known. See, for example, Manipulating the Mouse
Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 
1986). Recombinase dependent knockouts can also be generated, e.g. by 
homologous recombination, to insert recombinase target sequences flanking 
portions of an endogenous HDAC gene, such that inactivation of an HDAC
gene sequence or allele can be controlled in a tissue specific and/or
temporal manner as above.

In knock-outs, transgenic mice may be generated which are
homozygous for a mutated, non-functional HDAC gene which is introduced
into the animals using well known techniques. Surviving knock-out mice
produce no functional HDAC and thus are useful to study the function of
HDAC. Furthermore, the mice may be used in assays to study the effects of
test compounds in HDAC-deficient animals. For instance, HDAC-deficient
mice can be used to determine if, how and to what extent HDAC inhibitors will
affect the animal and thus address concerns associated with inhibiting the
activity of the molecule.

More specifically, methods of generating genetically deficient knock-out
mice are well known and are disclosed in M.R. Capecchi, 1989, Science,
244:1288-1292 and P. Li et al., 1995, Cell, 80:401-411. For example, a
human HDAC cDNA clone can be used to isolate a murine HDAC genomic
clone. The genomic clone can be used to prepare an HDAC targeting
construct which can disrupt the HDAC gene in the mouse by homologous
recombination. The targeting construct contains a non-functioning portion of
an HDAC gene which inserts in place of the functioning portion of the native
mouse gene. The non-functioning insert generally contains an insertion in the
exon that encodes the active region of the HDAC polypeptide. The targeting
construct can contain markers for both positive and negative selection. The
positive selection marker allows for the selective elimination of cells which do
not carry the marker, while the negative selection marker allows for the
elimination of cells that carry the marker.

For example, a first selectable marker is a positive marker that will
allow for the survival of cells carrying it. In some instances, the first selectable
marker is an antibiotic resistance gene, such as the neomycin resistance
gene, which can be placed within the coding sequence of a novel HDAC gene
to render it non-functional, while at the same time rendering the construct
selectable. The antibiotic resistance gene is within the homologous region
which can recombine with native sequences. Thus, upon homologous
recombination, the non-functional and antibiotic resistance selectable gene
sequences will be taken up. Knock-out mice may be used as models for
studying inflammation-related disorders and screening compounds for treating
these disorders.

The targeting construct also contains a second selectable marker
which is a negative selectable marker. Cells with the negative selectable
marker will be eliminated. The second selectable marker is outside the
recombination region. Thus, if the entire construct is present in the cell, both
markers will be present. If the construct has recombined with native
sequences, the first selectable marker will be incorporated into the genome
and the second will be lost. The herpes simplex virus thymidine kinase (HSV
tk) gene is an example of a negative selectable marker which can be used as
a second marker to eliminate cells that carry it. Cells with the HSV tk gene
are selectively killed in the presence of ganciclovir.
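The positive/negative selection logic described above can be sketched in a few lines of Python. This is an illustration only: the `Cell` class, its field names and the three example cells are hypothetical and do not come from the specification. The sketch encodes the stated rule that a cell survives when it carries the first (positive) marker and has lost the second (negative) marker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical record of which markers a transfected cell carries."""
    name: str
    has_positive_marker: bool  # e.g., neomycin resistance gene inside the homologous region
    has_negative_marker: bool  # e.g., HSV tk gene outside the recombination region


def survives_selection(cell: Cell) -> bool:
    """Apply both selections: antibiotic (positive) and ganciclovir (negative)."""
    survives_antibiotic = cell.has_positive_marker       # cells lacking the marker are eliminated
    survives_ganciclovir = not cell.has_negative_marker  # cells carrying HSV tk are killed
    return survives_antibiotic and survives_ganciclovir


cells = [
    Cell("untransfected", has_positive_marker=False, has_negative_marker=False),
    Cell("entire_construct_present", has_positive_marker=True, has_negative_marker=True),
    Cell("homologous_recombinant", has_positive_marker=True, has_negative_marker=False),
]

survivors = [cell.name for cell in cells if survives_selection(cell)]
print(survivors)
```

Only the cell in which the construct recombined with native sequences passes both selections, which is the enrichment the two-marker design is intended to achieve.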

Cells are transfected with targeting constructs and then selected for the
presence of the first selection marker and the absence of the second.
Constructs/DNA are then injected into blastocyst-stage embryos, which are
implanted into pseudopregnant females. Chimeric offspring which are capable of
transferring the recombinant genes in their germline are selected and mated, and
their offspring are examined for heterozygous carriers of the recombined genes.
Mating of the heterozygous offspring can then be used to generate fully
homozygous offspring which constitute HDAC-deficient knock-out mice.
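As a rough illustration of the final breeding step, the following Python sketch enumerates the offspring genotypes expected from mating two heterozygous carriers. It assumes simple Mendelian inheritance at a single targeted locus and ignores any effect of the disruption on viability; these are assumptions made for the example, not statements from the specification. The allele symbols `+` (native) and `-` (disrupted) and the function name `cross` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict:
    """Return expected genotype fractions for a single-locus cross.

    Each parent is a two-character string of alleles, e.g. "+-".
    """
    # Each offspring receives one allele from each parent; sort so "+-" == "-+".
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


ratios = cross("+-", "+-")
for genotype, fraction in sorted(ratios.items()):
    print(genotype, fraction)
```

Under these assumptions, one quarter of the offspring of a heterozygous cross would be homozygous for the disrupted allele.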
Embodiments of the Invention 

• An isolated polynucleotide encoding a histone deacetylase polypeptide
comprising an amino acid sequence selected from the group consisting of
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:87, SEQ ID
NO:93, and SEQ ID NO:95.

• An isolated polynucleotide encoding an amino acid sequence selected
from the group consisting of:

a. an amino acid sequence comprising residues 1009-1069
of SEQ ID NO:87; and

b. an amino acid sequence comprising residues 720-780 of SEQ
ID NO:93.

• An isolated polynucleotide comprising a nucleotide sequence selected 
from the group consisting of SEQ ID NO:1, SEQ ID NO:12, SEQ ID 
NO:19, SEQ ID NO:88, SEQ ID NO:94, and SEQ ID NO:96. 

• An isolated polynucleotide comprising a nucleotide sequence selected
from the group consisting of:

a. a nucleotide sequence which is at least 60% identical to
SEQ ID NO:1;

b. a nucleotide sequence which is at least 60% identical to
SEQ ID NO:12;

c. a nucleotide sequence which is at least 60% identical to
SEQ ID NO:19;

d. a nucleotide sequence which is at least 67.8% identical to
SEQ ID NO:88;

e. a nucleotide sequence which is at least 70% identical to
SEQ ID NO:94;

f. a nucleotide sequence which is at least 59.8% identical to
SEQ ID NO:96;

g. a nucleotide sequence which is at least 94.4% identical to
nucleotides 1 to 3207 of SEQ ID NO:88;

h. a nucleotide sequence which is at least 55.4% identical to
nucleotides 307 to 1791 of SEQ ID NO:96;

i. a nucleotide sequence comprising nucleotides 1 to 3207 of
SEQ ID NO:88;

j. a nucleotide sequence comprising nucleotides 1 to 2340 of
SEQ ID NO:94;

k. a nucleotide sequence comprising nucleotides 307 to 1791 of
SEQ ID NO:96;

l. a nucleotide sequence comprising nucleotides 4 to 3207 of
SEQ ID NO:88 wherein said nucleotides encode amino acids 2 to 1069 of
SEQ ID NO:87 lacking the start methionine; and

m. a nucleotide sequence comprising nucleotides 310 to 1791 of
SEQ ID NO:96 wherein said nucleotides encode amino acids 2 to 495 of
SEQ ID NO:95 lacking the start methionine.
• An isolated polynucleotide comprising a nucleotide sequence selected
from the group consisting of:

a. a nucleotide sequence comprising at least 25 contiguous
nucleotides of SEQ ID NO:1;

b. a nucleotide sequence comprising at least 25 contiguous
nucleotides of SEQ ID NO:12;

c. a nucleotide sequence comprising at least 25 contiguous
nucleotides of SEQ ID NO:19;

d. a nucleotide sequence comprising at least 2755 contiguous
nucleotides of SEQ ID NO:88;

e. a nucleotide sequence comprising at least 2160 contiguous
nucleotides of SEQ ID NO:94;

f. a nucleotide sequence comprising at least 1195 contiguous
nucleotides of SEQ ID NO:96;

g. a nucleotide sequence comprising at least 183 contiguous
nucleotides of SEQ ID NO:88; and

h. a nucleotide sequence comprising at least 17 contiguous
nucleotides of SEQ ID NO:96.

• An isolated polynucleotide comprising a nucleotide sequence selected
from the group consisting of:

a. a nucleotide sequence comprising nucleotides 3024-4467
of SEQ ID NO:88;

b. a nucleotide sequence comprising nucleotides 2156-3650
of SEQ ID NO:94;

c. a nucleotide sequence comprising nucleotides 1174-3391
of SEQ ID NO:96;

d. a nucleotide sequence comprising nucleotides 3024-3207
of SEQ ID NO:88; and

e. a nucleotide sequence comprising nucleotides 1174-1791 of
SEQ ID NO:96.

• A primer comprising a nucleotide sequence selected from the group
consisting of SEQ ID NO:24-27, SEQ ID NO:28-35, SEQ ID NO:39-46,
SEQ ID NO:47-62, SEQ ID NO:65-66, SEQ ID NO:67-74, SEQ ID NO:75-82,
and SEQ ID NO:104-105.

• A probe comprising a nucleotide sequence selected from the group
consisting of SEQ ID NO:36, SEQ ID NO:63-64, SEQ ID NO:83-86, SEQ
ID NO:92, and SEQ ID NO:101-103.

• A cell line comprising the isolated polynucleotide according to any one of
the preceding embodiments.

• A gene delivery vector comprising the isolated polynucleotide according to
any one of the preceding embodiments.

• An expression vector comprising the isolated polynucleotide according to
any one of the preceding embodiments.

• A host cell comprising the expression vector according to any one of the
preceding embodiments, wherein the host cell is selected from the group
consisting of bacterial, yeast, insect, mammalian, and human cells.

• An isolated polypeptide comprising an amino acid sequence selected from 
the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ 
ID NO:87, SEQ ID NO:93, and SEQ ID NO:95. 

• An isolated polypeptide comprising an amino acid sequence selected from
the group consisting of:

a. an amino acid sequence which is at least 72% identical to
SEQ ID NO:2;

b. an amino acid sequence which is at least 79% identical to
SEQ ID NO:4;

c. an amino acid sequence which is at least 70% identical to
SEQ ID NO:5;

d. an amino acid sequence which is at least 94.2% identical to
SEQ ID NO:87;

e. an amino acid sequence which is at least 95% identical to
SEQ ID NO:93; and

f. an amino acid sequence which is at least 55.3% identical to
SEQ ID NO:95.

• An isolated polypeptide comprising an amino acid sequence selected from
the group consisting of:

a. an amino acid sequence comprising at least 8 contiguous
amino acids of SEQ ID NO:2;

b. an amino acid sequence comprising at least 8 contiguous
amino acids of SEQ ID NO:4;

c. an amino acid sequence comprising at least 8 contiguous
amino acids of SEQ ID NO:5;

d. an amino acid sequence comprising at least 920 contiguous
amino acids of SEQ ID NO:87;

e. an amino acid sequence comprising at least 720 contiguous
amino acids of SEQ ID NO:93; and

f. an amino acid sequence comprising at least 400 contiguous
amino acids of SEQ ID NO:95.

• An isolated polypeptide comprising an amino acid sequence selected from
the group consisting of:

a. an amino acid sequence comprising residues 1009-1069
of SEQ ID NO:87; and

b. an amino acid sequence comprising residues 720-780 of SEQ
ID NO:93.

• An isolated fusion protein comprising the isolated polypeptide according to 
any one of the preceding embodiments. 

• An antibody which binds specifically to the isolated polypeptide according
to any one of the preceding embodiments, wherein the antibody is
selected from the group consisting of polyclonal and monoclonal
antibodies.

• An antibody which binds specifically to the isolated fusion protein
according to any one of the preceding embodiments.

• An antisense polynucleotide comprising a nucleotide sequence that is
complementary to at least 20 contiguous nucleotides of the isolated
polynucleotide according to any one of the preceding embodiments.

• An antisense polynucleotide comprising a nucleotide sequence selected
from the group consisting of SEQ ID NO:36, SEQ ID NO:63-64, and SEQ
ID NO:83-86.

• An expression vector comprising the antisense polynucleotide according to
any one of the preceding embodiments.

• A pharmaceutical composition comprising the monoclonal antibody
according to any one of the preceding embodiments, and a physiologically
acceptable carrier, diluent, or excipient.

• A pharmaceutical composition comprising the antisense polynucleotide
according to any one of the preceding embodiments and a physiologically
acceptable carrier, diluent, or excipient.

• A pharmaceutical composition comprising the expression vector according
to any one of the preceding embodiments, and a physiologically
acceptable carrier, diluent, or excipient.

• A pharmaceutical composition comprising the gene delivery vector
according to any one of the preceding embodiments, and a physiologically
acceptable carrier, diluent, or excipient.

• A pharmaceutical composition comprising the host cell according to any
one of the preceding embodiments, and a physiologically acceptable
carrier, diluent, or excipient.

• A pharmaceutical composition comprising the modulating agent according
to any one of the following embodiments, and a physiologically acceptable
carrier, diluent, or excipient.

• A method of treating cancer comprising administering the pharmaceutical
composition according to any one of the preceding embodiments in an
amount effective for treating the cancer.

In various aspects, the cancer is selected from the group
consisting of bladder cancer, lung cancer, breast cancer, colon cancer,
rectal cancer, endometrial cancer, ovarian cancer, head and neck cancer,
prostate cancer, and melanoma.

In other aspects, the breast cancer is selected from the group
consisting of ductal carcinoma in situ, intraductal carcinoma, lobular
carcinoma in situ, papillary carcinoma, and comedocarcinoma,
adenocarcinomas, and carcinomas, such as infiltrating ductal carcinoma,
infiltrating lobular carcinoma, infiltrating ductal and lobular carcinoma,
medullary carcinoma, mucinous carcinoma, comedocarcinoma, Paget's
Disease, papillary carcinoma, tubular carcinoma, and inflammatory
carcinoma.

In further aspects, the prostate cancer is selected from the
group consisting of adenocarcinomas and sarcomas, and pre-cancerous
conditions, such as prostate intraepithelial neoplasia.

• A method of diagnosing a cancer comprising:

a. incubating the isolated polynucleotide according to any
one of the preceding embodiments with a biological sample under
conditions to allow the isolated polynucleotide to amplify a polynucleotide
in the sample to produce an amplification product; and

b. measuring levels of amplification product formed in (a),
wherein an alteration in these levels compared to standard levels indicates
diagnosis of the cancer.

In various aspects, the cancer is selected from the group consisting of
bladder cancer, lung cancer, breast cancer, colon cancer, rectal cancer,
endometrial cancer, ovarian cancer, head and neck cancer, prostate
cancer, and melanoma.

In other aspects, the breast cancer is selected from the group consisting of
ductal carcinoma in situ, intraductal carcinoma, lobular carcinoma in situ,
papillary carcinoma, and comedocarcinoma, adenocarcinomas, and
carcinomas, such as infiltrating ductal carcinoma, infiltrating lobular
carcinoma, infiltrating ductal and lobular carcinoma, medullary carcinoma,
mucinous carcinoma, comedocarcinoma, Paget's Disease, papillary
carcinoma, tubular carcinoma, and inflammatory carcinoma.

In further aspects, the prostate cancer is selected from the group consisting of
adenocarcinomas and sarcomas, and pre-cancerous conditions, such as
prostate intraepithelial neoplasia.

• A method of diagnosing cancer comprising:

a. contacting the antibody according to any one of the
preceding embodiments with a biological sample under conditions to allow
the antibody to associate with a polypeptide in the sample to form a
complex; and

b. measuring levels of complex formed in (a), wherein an
alteration in these levels compared to standard levels indicates diagnosis
of the cancer.

In various aspects, the cancer is selected from the group
consisting of bladder cancer, lung cancer, breast cancer, colon cancer,
rectal cancer, endometrial cancer, ovarian cancer, head and neck cancer,
prostate cancer, and melanoma.

In other aspects, the breast cancer is selected from the group
consisting of ductal carcinoma in situ, intraductal carcinoma, lobular
carcinoma in situ, papillary carcinoma, and comedocarcinoma,
adenocarcinomas, and carcinomas, such as infiltrating ductal carcinoma,
infiltrating lobular carcinoma, infiltrating ductal and lobular carcinoma,
medullary carcinoma, mucinous carcinoma, comedocarcinoma, Paget's
Disease, papillary carcinoma, tubular carcinoma, and inflammatory
carcinoma.

In further aspects, the prostate cancer is selected from the
group consisting of adenocarcinomas and sarcomas, and pre-cancerous
conditions, such as prostate intraepithelial neoplasia.

• A method of detecting a histone deacetylase polynucleotide comprising:

a. incubating the isolated polynucleotide according to any
one of the preceding embodiments with a biological sample under
conditions to allow the polynucleotide to hybridize with a polynucleotide in
the sample to form a complex; and

b. identifying the complex formed in (a), wherein identification of
the complex indicates detection of a histone deacetylase polynucleotide.

• A method of detecting a histone deacetylase polypeptide comprising:

a. incubating the antibody according to any one of the
preceding embodiments with a biological sample under conditions to allow
the antibody to associate with a polypeptide in the sample to form a
complex; and

b. identifying the complex formed in (a), wherein
identification of the complex indicates detection of a histone deacetylase
polypeptide.

• A method of screening test agents to identify modulating agents capable of
altering deacetylase activity of a histone deacetylase polypeptide
comprising:

a. contacting the isolated polypeptide according to any one
of the preceding embodiments with test agents under conditions to allow
the polypeptide to associate with one or more test agents; and

b. selecting test agents that alter the deacetylase activity of the
polypeptide, whereby this alteration indicates identification of modulating
agents.

In various aspects, the modulating agents are selected from the group
consisting of antagonists and inhibitors of histone deacetylase activity.

In other aspects, the modulating agents are selected from the group
consisting of agonists or activators of histone deacetylase activity.

• A method for screening test agents to identify modulating agents which
inhibit or antagonize deacetylation activity of a histone deacetylase,
comprising:

a. combining an isolated polypeptide according to any one of
the preceding embodiments having a histone deacetylase activity with a
histone deacetylase substrate and a test agent in a reaction mixture; and

b. determining the conversion of the substrate to product;
wherein a statistically significant decrease in the conversion of the
substrate in the presence of the test agent indicates identification of a
modulating agent which inhibits or antagonizes the deacetylation activity of
histone deacetylase.

• A method for screening test agents to identify modulating agents that
inhibit or antagonize interaction of histone deacetylase with a histone
deacetylase binding protein, comprising:

a. combining the isolated polypeptide according to any one of
the preceding embodiments having a histone deacetylase activity with the
histone deacetylase binding protein and a test agent in a reaction mixture;
and

b. detecting the interaction of the polypeptide with the histone
deacetylase binding protein to form a complex; wherein a statistically
significant decrease in the interaction of the polypeptide and protein in the
presence of the test agent indicates identification of a modulating agent
which inhibits or antagonizes interaction of the histone deacetylase
polypeptide with the histone deacetylase binding protein.

In various aspects, one or both of the histone deacetylase polypeptide
and the histone deacetylase binding protein is a fusion protein.

In other aspects, at least one of the histone deacetylase polypeptide and the
histone deacetylase binding protein comprises a detectable label for
detecting the formation of the complex.

In a further aspect, the interaction of the histone deacetylase polypeptide and
the histone deacetylase binding protein is detected in a two-hybrid assay
system.

• A method of screening a library of molecules or compounds to identify at
least one molecule or compound therein which specifically binds to a
histone deacetylase polynucleotide, comprising:

a. combining the isolated polynucleotide according to any
one of the preceding embodiments with a library of molecules or
compounds under conditions to allow specific binding of the polynucleotide
to at least one of the molecules or compounds; and

b. detecting the specific binding in (a), thereby identifying a molecule or
compound which specifically binds to the histone deacetylase
polynucleotide.

In various aspects, the library comprises molecules
selected from the group consisting of
DNA molecules, RNA molecules, artificial chromosomes, PNAs, peptides,
and polypeptides.

In one aspect, the detecting is performed by the use of high throughput
screening.

• A method of treating a disease or disorder associated with abnormal cell
growth or proliferation in a mammal comprising administering the
antagonist or inhibitor of histone deacetylase polypeptide according to any
one of the preceding embodiments in an amount effective to treat the
disease or disorder.

In various aspects, the disease or disorder is selected from neoplasms,
tumors and cancers.

• A method of treating a disease or disorder associated with abnormal cell
growth or proliferation in a mammal comprising administering the
antisense polynucleotide according to any one of the preceding
embodiments in an amount effective to treat the disease or disorder.

In various aspects, the disease or disorder is selected from
neoplasms, tumors and cancers.

• A method of modulating one or more of cell growth or proliferation, cell
differentiation, or cell survival of a eukaryotic cell, comprising combining
the cell with an effective amount of a modulating agent that alters the
deacetylase activity of a histone deacetylase polypeptide comprising an
amino acid sequence selected from the group consisting of SEQ ID NO:2,
SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:87, SEQ ID NO:93, and SEQ ID
NO:95, and thereby modulating the rate of one or more of cell growth or
proliferation, cell differentiation, or cell survival of the eukaryotic cell,
relative to the effect on the eukaryotic cells in the absence of the
modulating agent.
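
Several of the embodiments above define sequences by a minimum percent identity to a reference sequence or by a minimum number of contiguous residues shared with it. The following Python sketch shows one way such criteria could be checked. It rests on assumptions: the identity calculation expects two sequences that have already been aligned (with `-` marking gaps) and counts identity over aligned columns, whereas this section does not specify the alignment program or the exact identity definition. All function names and the short example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumes the inputs are already aligned and of equal length; '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def longest_shared_run(query: str, reference: str) -> int:
    """Length of the longest stretch of contiguous residues common to both sequences."""
    query, reference = query.upper(), reference.upper()
    best = 0
    previous = [0] * (len(reference) + 1)
    for q in query:
        current = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if q == r:
                # Extend the run that ended at the previous residue of each sequence.
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


def has_contiguous_match(query: str, reference: str, min_length: int) -> bool:
    """True if the query shares at least min_length contiguous residues with the reference."""
    return longest_shared_run(query, reference) >= min_length


# Hypothetical toy sequences, not taken from the sequence listing.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
shared = longest_shared_run("AAACGTACGTTT", "GGACGTACGGG")
print(identity, shared)
```

The same two functions apply to amino acid sequences, since they compare characters without regard to alphabet.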

EXAMPLES 

The Examples below are provided to illustrate the subject invention and 
are not intended to limit the invention in any way. 

EXAMPLE 1: IDENTIFICATION OF NOVEL HDAC GENE FRAGMENTS

Gene fragments encoding the novel HDAC (HDAL) polypeptides of this
invention were identified by a combination of the following methods.
Homology-based searches using the TBLASTN program (S.F. Altschul et al.,
1997, Nucl. Acids Res., 25(17):3389-3402) were performed to compare
known histone deacetylases with human genomic (gDNA) and EST
sequences. EST or gDNA sequences having significant homology to one or
more of the known histone deacetylases (expect score less than or equal to
1 x 10^-3) were retained for further analysis.
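
The retention rule for the homology search can be expressed as a simple filter. The sketch below is illustrative only: it assumes search hits have already been parsed into records with a subject identifier and an expect value, and the `Hit` class, the identifiers and the example values are hypothetical. The cutoff of 1 x 10^-3 is the one stated in the text.

```python
from dataclasses import dataclass
from typing import List

EXPECT_CUTOFF = 1e-3  # expect score threshold stated in the text


@dataclass(frozen=True)
class Hit:
    """Hypothetical parsed search hit."""
    subject_id: str
    expect: float


def retain_significant(hits: List[Hit], cutoff: float = EXPECT_CUTOFF) -> List[Hit]:
    """Keep hits whose expect score is less than or equal to the cutoff."""
    return [hit for hit in hits if hit.expect <= cutoff]


hits = [Hit("est_A", 2e-40), Hit("gdna_B", 1e-3), Hit("est_C", 0.5)]
retained = [hit.subject_id for hit in retain_significant(hits)]
print(retained)
```

A hit exactly at the cutoff is retained because the criterion is "less than or equal to".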

Hidden Markov Model (HMM) searches using PFAM motifs (listed in
Table 1) (A. Bateman et al., 1999, Nucleic Acids Research, 27:260-262 and
E.L. Sonnhammer et al., 1997, Proteins, 28(3):405-420) were used to search
human genomic sequence using the Genewise program. EST or gDNA sequences
having a significant score (greater than or equal to 10) with any of the
following motifs were retained for further analysis.

HMM searches using PFAM motifs (listed in Table 1) were used to search
predicted protein sequences identified by GENSCAN analysis of human
genomic sequence (C. Burge and S. Karlin, 1997, J. Mol. Biol., 268(1):78-94).
gDNA sequences having a significant score (greater than or equal to 10) with
any of the following motifs were retained for further analysis.



Table 1: PFAM motifs used to identify histone deacetylases

Motif Name       PFAM Accession #    Description
Hist_deacetyl    PF00850             Histone deacetylase family (length 342)

Once a bacterial artificial chromosome (BAC) encoding a novel histone
deacetylase-like protein was identified by any of the methods listed above, its
predicted protein sequence was used to identify the most closely related
known histone deacetylase using the BLASTP program (NCBI). This known
protein was used as the query for a GenewiseDB search of the original BAC
and all nearby BACs (identified by the Golden Path tiling map, UCSC). The
results were used to identify additional potential exons, intron/exon
boundaries, partial transcript cDNA sequence and partial predicted protein
sequence for the novel HDAC gene. The Primer3 program (S. Rozen et al.,
1998, 0.6 Ed., Whitehead Institute Center for Genomic Research, Cambridge,
MA) was used to design PCR primers within single exons and between
adjacent exons and to design antisense 80mer probes for use in isolating
cDNA clones.

EXAMPLE 2: ANALYSIS OF HDACs 

Enzymatic Activity Measurements

Constructs representing the open reading frames of the identified novel
sequences are engineered in frame with c-MYC or FLAG epitopes using
commercially available mammalian expression vectors. These plasmids are
transfected into HEK293 or COS7 cells and novel HDAC protein expression
is analyzed by Western blot analysis of protein lysates from the
transfectants using anti-MYC epitope or anti-FLAG epitope antibodies.
MYC- or FLAG-tagged HDAC proteins are immunoprecipitated from the
lysates and incubated with [3H]acetate- or fluorescent-labeled acetylated
proteins. Release of [3H]acetate or decrease in fluorescent signal intensity is
used to establish the activity of the putative HDACs. The effects of pan-HDAC
chemical inhibitors on the enzymatic activity of the novel HDACs are
also assessed and compared with the activity of known HDAC proteins and
their inhibition with these chemical agents.

Transcriptional Assays

HDAC proteins have been shown to positively or negatively regulate
transcriptional pathways. The ability of the novel HDAC proteins to repress or
activate the constitutive or regulated activity of transcriptional reporter
plasmids is assessed. These assays are performed using transient
transfections of mammalian expression constructs encoding the novel HDAC
proteins with reporter plasmid constructs containing response elements of
specific transcriptional pathways (e.g., p53, AP1, androgen receptor,
LEF1/TCF4), a minimal promoter and a reporter gene product (e.g., alkaline
phosphatase, luciferase, green fluorescent protein).

Alternatively, the novel HDACs are transfected into cell lines
engineered to stably express these transcriptional reporter plasmids.
Because the consequence of HDAC expression could be inhibitory or
stimulatory, the effects of the novel HDAC proteins on these transcriptional
responses are monitored in the presence and absence of activators of the
pathway. Similar to enzymatic activity measurements, pan-inhibitors of the
known HDACs are also examined to establish the enzymatic activity of the
novel HDAC gene products as protein deacetylases.

Expression Analysis

Initial insights into the role of the novel HDACs in normal physiology
and disease states are obtained by a variety of expression analyses.
Quantitative reverse transcriptase polymerase chain reaction (RT-PCR) using
primers specific to the novel sequences is implemented to evaluate the
expression of novel HDAC mRNA in a variety of normal cell lines and tissues
as well as a spectrum of human tumor cell lines. Expression profiles of novel
HDACs are confirmed using Northern blot analysis or ribonuclease protection
assays.

In addition, tissue arrays containing a variety of patient organ samples
and arrays of malignant tissue are evaluated by in situ hybridization to gain
further insights into the association of the novel HDAC proteins with particular
physiological responses and with neoplasia.

Subcellular Localization

The subcellular localization of MYC- or FLAG-tagged novel HDAC
proteins is determined upon ectopic expression in mammalian cells. Cells are
fixed, permeabilized and incubated with anti-MYC or anti-FLAG antibodies to
detect expressed protein. The localization of tagged proteins is then detected
using CY3- or FITC-conjugated secondary antibodies and visualized by
fluorescent microscopy. These studies can determine if the assayed HDACs
deacetylate nuclear or cytoplasmic protein substrates.

EXAMPLE 3: OLIGONUCLEOTIDES FOR THE ISOLATION OF HDACs

BMY_HDAL1

Based on the predicted gene structure of BMY_HDAL1, the Primer3
program designed the following PCR primers and probe oligonucleotides for
isolation of cDNAs. Table 2 presents single exon primers and probes for
BMY_HDAL1 cDNA isolation. Table 3 presents multiple exon primers for
BMY_HDAL1 cDNA isolation. Table 4 presents BMY_HDAL1 capture
oligonucleotides. As shown below in Table 5, a separately designed primer set
was used to test for BMY_HDAL1 expression using a cDNA pool from human
placenta and the following human tumor cell lines: Caco-2, LS174-T, MIP,
HCT-116, A2780, OVCAR-3, HL60, A431, Jurkat, A549, PC3 and LNCaP cells.

BMY_HDAL2

Based on the predicted gene structure of BMY_HDAL2, the Primer3
program designed the following PCR primers and probe oligonucleotides for
isolation of cDNAs. BMY_HDAL2 single exon primers and probes are shown
in Table 6. Multiple exon primers for BMY_HDAL2 cDNA isolation are shown
in Table 7. BMY_HDAL2 capture oligonucleotides are shown in Table 8. As
shown in Table 9, a separately designed primer set was used to test for
BMY_HDAL2 expression using a cDNA pool from human placenta and the
following human tumor cell lines: Caco-2, LS174-T, MIP, HCT-116, A2780,
OVCAR-3, HL60, A431, Jurkat, A549, PC3 and LNCaP cells.

BMY_HDAL3

Based on the predicted gene structure of BMY_HDAL3, the Primer3
program designed the following PCR primers and probe oligonucleotides for
isolation of cDNAs. For BMY_HDAL3, the following primer sets were
designed from the AC002410 sequence using Primer3. Single exon primers
for the novel BMY_HDAL3 isolation are shown in Table 10. Multiple exon
primers for BMY_HDAL3 isolation are presented in Table 11. BMY_HDAL3
capture oligonucleotides are shown in Table 12.

EXAMPLE 4: COMPLEMENTARY POLYNUCLEOTIDES



Antisense molecules or nucleic acid sequences complementary to an
HDAC protein-encoding sequence, or any part thereof, can be used to
decrease or to inhibit the expression of naturally occurring HDAC. Although
the use of antisense or complementary oligonucleotides comprising about 15
to 35 base-pairs is described, essentially the same procedure is used with
smaller or larger nucleic acid sequence fragments. An oligonucleotide based
on the coding sequence of an HDAC polypeptide or peptide, for example, as
shown in FIG. 1, FIG. 5, FIG. 10, FIGS. 15A-15C, FIGS. 20A-20C, and FIGS.
21A-21B, and as depicted in SEQ ID NO:1, SEQ ID NO:12, SEQ ID NO:19,
SEQ ID NO:88, SEQ ID NO:94, or SEQ ID NO:96, is used to
inhibit expression of naturally occurring HDAC. The complementary
oligonucleotide is typically designed from the most unique 5' sequence and is
used either to inhibit transcription by preventing promoter binding to the
coding sequence, or to inhibit translation by preventing the ribosome from
binding to an HDAC protein-encoding transcript.

Using a portion SEQ ID NO:1, SEQ ID NO:12, SEQ ID NO:19, SEQ ID 
NO:88, SEQ ID NO:94, or SEQ ID NO:96, for example, an effective antisense 
oligonucleotide includes any of about 15-35 nucleotides spanning the region 

20 which translates into the signal or 5' coding sequence of the HDAC 
polypeptide. Appropriate oligonucleotides are designed using OLIGO 4.06 
software and the HDAC coding sequence (e.g., SEQ ID NO:1 , SEQ ID NO:12, 
SEQ ID NO:19, SEQ ID NO:88, SEQ ID NO:94, or SEQ ID NO:96). 
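
The enumeration of candidate antisense oligonucleotides described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is not the OLIGO 4.06 procedure, the 60-nucleotide 5' window is an arbitrary choice, the function names are hypothetical, and the demonstration sequence is a placeholder rather than any SEQ ID NO of the specification.

```python
# Minimal sketch: list antisense candidates of 15-35 nt against the 5' end
# of a coding sequence. Window size and names are illustrative assumptions.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_candidates(coding_seq, min_len=15, max_len=35, window=60):
    """Return (start, length, antisense) tuples for every target site of
    min_len..max_len nucleotides lying within the first `window` nt."""
    region = coding_seq[:window]
    candidates = []
    for length in range(min_len, max_len + 1):
        for start in range(0, len(region) - length + 1):
            target = region[start:start + length]
            candidates.append((start, length, reverse_complement(target)))
    return candidates


# Placeholder sequence for demonstration only (not from the specification).
demo = "atggcttcagatcgtacgat" * 4
cands = antisense_candidates(demo)
```

A real design step would additionally rank the candidates, for example by uniqueness against a sequence database, which this sketch does not attempt.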

EXAMPLE 5: NORTHERN BLOT ANALYSIS FOR HDACs

Northern blot analysis is used to detect the presence of a transcript of
a gene and involves the hybridization of a labeled nucleotide sequence to a
membrane on which RNA from a particular cell or tissue type has been bound
(See, J. Sambrook et al., supra). Analogous computer techniques using
BLAST (S.F. Altschul, 1993, J. Mol. Evol., 36:290-300 and S.F. Altschul et al.,
1990, J. Mol. Biol., 215:403-410) are used to search for identical or related
molecules in nucleotide databases, such as GenBank or the LIFESEQ
database (Incyte Pharmaceuticals). This analysis is much more rapid and
less labor-intensive than performing multiple, membrane-based
hybridizations. In addition, the sensitivity of the computer search can be
modified to determine whether any particular match is categorized as being
exact (identical) or homologous.

The basis of the search is the product score, which is defined as
follows: (% sequence identity x maximum BLAST score) / 100. The product score
takes into account both the degree of similarity between two sequences and
the length of the sequence match. For example, with a product score of 40,
the match will be exact within a 1-2% error; at 70, the match will be exact.
Homologous molecules are usually identified by selecting those which show
product scores between 15 and 40, although lower scores may identify related
molecules. The results of Northern analysis are reported as a list of libraries
in which the transcript encoding HDAC polypeptides occurs. Abundance and
percent abundance are also reported. Abundance directly reflects the number
of times that a particular transcript is represented in a cDNA library, and
percent abundance is abundance divided by the total number of sequences
that are examined in the cDNA library.
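
As a worked illustration of the product score and percent abundance defined above, the following Python sketch computes both quantities. The bin labels and the treatment of the ranges between the stated thresholds (15, 40, 70) are an interpretation made for this example, and expressing percent abundance on a 0-100 scale is likewise an assumption; the function names are hypothetical.

```python
# Sketch of the product score and percent abundance calculations.
# Threshold binning between the stated values is an assumption.

def product_score(percent_identity: float, max_blast_score: float) -> float:
    """(% sequence identity x maximum BLAST score) / 100."""
    return (percent_identity * max_blast_score) / 100.0


def classify(score: float) -> str:
    """Map a product score onto the categories mentioned in the text."""
    if score >= 70:
        return "exact"
    if score >= 40:
        return "near-exact"        # exact within roughly 1-2% error
    if score >= 15:
        return "homologous"
    return "possibly related"      # lower scores may still be related


def percent_abundance(abundance: int, total_sequences: int) -> float:
    """Abundance divided by total sequences examined, as a percentage."""
    return 100.0 * abundance / total_sequences
```

For instance, a match with 90% identity and a maximum BLAST score of 50 gives a product score of 45, which this sketch places in the "near-exact" bin.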
EXAMPLE 6: MICROARRAYS FOR ANALYSIS OF HDACs

For the production of oligonucleotides for a microarray, an HDAC
sequence, e.g., a novel HDAC having SEQ ID NO:1, SEQ ID NO:12, SEQ ID
NO:19, SEQ ID NO:88, SEQ ID NO:94, or SEQ ID NO:96, is
examined using a computer algorithm which starts at the 3' end of the
nucleotide sequence. The algorithm identifies oligomers of defined length that
are unique to the gene, have a GC content within a range that is suitable for
hybridization, and lack predicted secondary structure that would interfere with
hybridization. The algorithm identifies specific oligonucleotides of 20
nucleotides in length, i.e., 20-mers. A matched set of oligonucleotides is
created in which one nucleotide in the center of each sequence is altered.
This process is repeated for each gene in the microarray, and double sets of
20-mers are synthesized in the presence of fluorescent or radioactive
nucleotides and arranged on the surface of a substrate. When the substrate
is a silicon chip, a light-directed chemical process is used for deposition (WO
95/11995, M. Chee et al.).
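
The probe-selection step described above can be sketched roughly as follows. This is a simplified illustration, not the algorithm actually used: the 40-60% GC window, the non-overlapping scan, and the function names are assumptions, and the uniqueness and secondary-structure filters mentioned in the text are omitted.

```python
# Sketch: scan from the 3' end for 20-mers with acceptable GC content and
# build the matched center-mismatch partner. GC limits are assumed values.
SWAP = {"a": "t", "t": "a", "g": "c", "c": "g"}


def gc_fraction(seq: str) -> float:
    s = seq.lower()
    return (s.count("g") + s.count("c")) / len(s)


def pick_probes(seq, length=20, gc_min=0.4, gc_max=0.6, max_probes=3):
    """Walk from the 3' end toward the 5' end, keeping non-overlapping
    oligomers whose GC fraction falls inside [gc_min, gc_max]."""
    probes = []
    start = len(seq) - length
    while start >= 0 and len(probes) < max_probes:
        oligo = seq[start:start + length]
        if gc_min <= gc_fraction(oligo) <= gc_max:
            probes.append((start, oligo))
            start -= length      # jump past the accepted probe
        else:
            start -= 1           # slide one base toward the 5' end
    return probes


def center_mismatch(oligo: str) -> str:
    """Return the matched oligo with its central nucleotide altered."""
    o = oligo.lower()
    mid = len(o) // 2
    return o[:mid] + SWAP[o[mid]] + o[mid + 1:]
```

Each perfect-match 20-mer and its `center_mismatch` partner together form the matched pair that the text refers to as a double set.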

Alternatively, a chemical coupling procedure and an ink jet device are
used to synthesize oligomers on the surface of a substrate (WO 95/25116,
J.D. Baldeschweiler et al.). As another alternative, a "gridded" array that is
analogous to a dot (or slot) blot is used to arrange and link cDNA fragments or
oligonucleotides to the surface of a substrate using, for example, a vacuum
system, or thermal, UV, mechanical, or chemical bonding techniques. A
typical array may be produced by hand, or by using available materials and
equipment, and may contain grids of 8 dots, 24 dots, 96 dots, 384 dots, 1536
dots, or 6144 dots. After hybridization, the microarray is washed to remove
any non-hybridized probe, and a detection device is used to determine the
levels and patterns of radioactivity or fluorescence. The detection device may
be as simple as X-ray film, or as complicated as a light scanning apparatus.
Scanned fluorescent images are examined to determine the degree of
complementarity and the relative abundance/expression level of each
oligonucleotide sequence in the microarray.
EXAMPLE 7: PURIFICATION OF HDAC POLYPEPTIDES

Naturally occurring or recombinant HDAC polypeptide is substantially
purified by immunoaffinity chromatography using antibodies specific for an
HDAC polypeptide, or a peptide derived therefrom. An immunoaffinity column
is constructed by covalently coupling anti-HDAC polypeptide antibody to an
activated chromatographic resin, such as CNBr-activated SEPHAROSE
(Amersham Pharmacia Biotech). After the coupling, the resin is blocked and
washed according to the manufacturer's instructions.

Medium containing HDAC polypeptide is passed over the
immunoaffinity column, and the column is washed under conditions that allow
the preferential adsorption of the HDAC polypeptide (e.g., high ionic strength
buffers in the presence of detergent). The column is eluted under conditions
that disrupt antibody/HDAC polypeptide binding (e.g., a buffer of pH 2-3, or a
high concentration of a chaotrope, such as urea or thiocyanate ion), and
HDAC polypeptide is collected.






EXAMPLE 8: IDENTIFICATION OF MOLECULES THAT INTERACT WITH
HDAC POLYPEPTIDES

HDAC polypeptides, or biologically active fragments thereof, are
labeled with 125I Bolton-Hunter reagent (Bolton et al., 1973, Biochem. J.,
133:529). Candidate molecules previously arrayed in wells of a multi-welled
plate are incubated with the labeled HDAC polypeptide, washed, and any
wells having labeled HDAC polypeptide-candidate molecule complexes are
assayed. Data obtained using different concentrations of HDAC polypeptide
are used to calculate values for the number, affinity and association of an
HDAC polypeptide with the candidate molecules.
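
The specification does not say how the binding parameters are calculated. One conventional approach, offered here purely as an assumption, is a Scatchard analysis of a single-site binding model, in which bound/free is linear in bound with slope -1/Kd and x-intercept Bmax. The sketch below fits that line by ordinary least squares; the function name is hypothetical.

```python
# Sketch: Scatchard fit for a one-site model, B/F = (Bmax - B) / Kd.
# The choice of model and method is an assumption, not from the source.

def scatchard_fit(free, bound):
    """Return (Kd, Bmax) from paired free and bound ligand concentrations."""
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # equals -1 / Kd
    intercept = mean_y - slope * mean_x  # equals Bmax / Kd
    kd = -1.0 / slope
    bmax = intercept * kd
    return kd, bmax
```

Here Kd reflects affinity and Bmax the number of binding sites; non-linear regression on the untransformed data is generally preferred when measurement noise is significant.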

Other methods suitable for identifying proteins, peptides or other
molecules that interact with an HDAC polypeptide include ligand binding
assays such as the yeast two-hybrid system as described hereinabove.
EXAMPLE 9: IDENTIFICATION AND CLONING OF HDAC9C

Bioinformatic searches of the assembled human genome sequence
were performed using a conserved consensus sequence derived from the
catalytic domain of class I and class II HDACs. Three gene fragments
(HDAL1, HDAL2, HDAL3) were identified from the assembled sequence of
human chromosome 7q36 that encoded amino acid sequences with homology
to class II HDACs. Biotinylated single stranded oligonucleotides representing
unique sequences from these predicted gene fragments of the following
sequences were prepared:

HDAL1, 5'-gtttcttgcagtcgtgaccagatactctgattcgtccagcatgctcagggt
gggtgggtggaattgccacaaacgca (SEQ ID NO:101);
HDAL2, 5'-tgccagggaaaaagttcccttcatcatagcgatggagtgaaatgtaca
ggatgctggggtcagcataaaaggcctgctgg (SEQ ID NO:102); and
HDAL3, 5'-tgatccagacatggtcttagtatctgctggatttgatgcattggaaggcca
cacccctcctctaggagggtacaaagtga (SEQ ID NO:103).

The biotinylated oligonucleotides were hybridized to fractions of cDNA
prepared from human placenta, and positive sequences were identified by
PCR. Three of the clones identified (HDACX1A, HDACX2A, and HDACX3A)
contained overlapping cDNAs that showed sequence identity to the predicted
gene fragments. These cDNAs encoded a novel sequence, designated
HDAC9c (FIGS. 15A-15C), that shared homology to class II HDACs. A
full-length HDAC9c construct was prepared by combining a 1.3 kb BamHI-PstI
fragment from the HDACX2A clone with a 3.5 kb PstI-NotI fragment from the
HDACX3A clone. These fragments were ligated into mammalian expression
vectors pcDNA3.1 and pcDNA4.0. The resulting constructs were evaluated
by DNA sequencing to confirm the identity of the inserts. The HDAC9c
pcDNA3.1 construct was deposited at the American Type Culture Collection
(ATCC), 10801 University Boulevard, Manassas, VA 20110-2209 on June 12,
2002 under ATCC Accession No. ________ according to the
terms of the Budapest Treaty.

Three fragments that encoded homology to class II HDACs were
identified from the assembled sequence of human chromosome 7q36.
Subsequent cDNA cloning and bioinformatics analysis revealed that these gene
fragments encoded a single class II HDAC, comprising a protein of 1147
amino acids. This sequence was provisionally designated as HDAC-9, and
later renamed HDAC9c. During the course of this work, similar sequences
were reported by Zhou et al. (2001, Proc. Natl. Acad. Sci. USA 98:10572-7),
including two isoforms related to class II HDAC proteins. Sequence
alignments revealed the HDAC-9 sequence was closely related to the
previously identified HDAC9 sequences (GenBank Accession Nos. AY032737
and AY032738). However, the published sequences lacked a large portion of
the C-terminal domain common to known class II HDAC proteins (FIGS.
15D-15F).

One of the HDAC9 isoforms (HDAC9a; GenBank Accession No.
AY032737) lacked approximately 185 C-terminal amino acids compared to other
HDAC family members. Another isoform of HDAC9 (HDAC9; GenBank Accession
No. AY032738) lacked approximately 65 C-terminal amino acids compared to
other HDAC family members. In contrast to these sequences, the HDAC9c
sequence, also designated as HDAC-X, contained more than 50 additional
amino acids at its C-terminus (FIGS. 15D-15F). The HDAC9c sequence was
deemed to represent the full-length version of HDAC9. Notably, HDAC9c
contained an LQQ sequence motif at positions 123-125. This motif was
missing in the HDAC9 C-terminal truncated isoforms, but was conserved in
other HDAC family members. Thus, the LQQ sequence motif may be
important for the function of the HDAC9c protein. No other motifs were
identified by PFAM analysis (A. Bateman et al., 2002, Nucl. Acids Res.
30:276-80).

EXAMPLE 10: EXPRESSION PROFILING FOR HDAC9

To determine the distribution of HDAC9 in adult normal tissues, the
expression profile of HDAC9 was examined by Northern blot analysis.
Northern blotting was performed as described (Sambrook et al., Molecular
Cloning: A Laboratory Manual, 2nd Edition). Tissue samples were obtained
from CLONTECH (Palo Alto, CA). The probe for Northern blotting was
derived from nucleotides 2917-3211 of HDAC9c (FIG. 16D; SEQ ID NO:92).
Two > 8.0 kb HDAC9 transcripts were detected at low levels in brain, skeletal
muscle, stomach, and trachea tissue (FIG. 16A). Upon longer exposure,
HDAC9 mRNA was also detected in mammary gland and prostate tissue
(FIG. 16A).

Given the low level of expression in normal tissues, experiments were
performed to determine the expression of HDAC9 in human tumor cell lines.
HDAC9 mRNA expression levels were evaluated by quantitative PCR
analysis on first-strand cDNA prepared from a variety of human tumor cell
lines (ATCC, Rockville, MD). HDAC9 levels were normalized to GAPDH
mRNA levels within the samples, and RNA levels were quantified using the
fluorophore SYBR green. For amplification, the following HDAC9 primers were
used: forward primer 5'-gtgacaccatttggaatgagctac (SEQ ID NO:104); and reverse
primer 5'-ttggaagccagctcgatgac (SEQ ID NO:105). HDAC9 expression was
found to be elevated in ovarian, breast, and certain lung cancer cell lines
(FIG. 16B). In contrast, HDAC9 was poorly expressed in tumor cell lines
derived from colon tumor specimens (FIG. 16B).
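
The normalization of HDAC9 signal to GAPDH can be illustrated with the common delta-Ct calculation. The source does not state which quantification formula was used, so this is a sketch under assumptions: an amplification efficiency of 2 (perfect doubling per cycle), hypothetical function names, and made-up threshold-cycle values for unnamed samples.

```python
# Sketch: relative expression of a target gene normalized to a reference
# gene by the delta-Ct method. Efficiency of 2.0 is an assumption.

def relative_expression(ct_target: float, ct_reference: float,
                        efficiency: float = 2.0) -> float:
    """Return target abundance relative to the reference gene."""
    return efficiency ** -(ct_target - ct_reference)


# Hypothetical Ct values (target, reference) for illustration only.
samples = {
    "sample_A": (24.0, 18.0),
    "sample_B": (27.0, 18.0),
    "sample_C": (30.0, 19.0),
}
normalized = {name: relative_expression(t, r)
              for name, (t, r) in samples.items()}
ranking = sorted(normalized, key=normalized.get, reverse=True)
```

A lower Ct for the target relative to the reference corresponds to higher normalized expression, so `ranking` orders the samples from highest to lowest relative level.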

To confirm these results, nuclease protection experiments were
performed on RNAs isolated from select tumor cell lines displaying a range of
HDAC9 expression. Nuclease protection was performed using 35S-labeled
UTP as a radioactive precursor for a riboprobe in accordance with published
methods (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd
Edition). The riboprobe sequence was derived from nucleotides 2917-3211 in
HDAC9c (FIG. 16D; SEQ ID NO:92). Brain tissue was included as a control to
show normal tissue expression levels. The profile of HDAC9 expression observed
by quantitative RT-PCR was confirmed by nuclease protection (i.e., A2780 >
MDA-MB453 > MCF7; FIG. 16C). The pervasive expression of HDAC9 in
tumor cell lines of diverse origin, and the low level expression of HDAC9 in
normal adult tissue, suggested that the expression of this gene was regulated
in tumor progression.

EXAMPLE 11: IN SITU HYBRIDIZATION TO ANALYZE HDAC9
EXPRESSION

To further analyze the upregulation of HDAC9 in tumor cells, a variety
of human tumor and normal tissue specimens were subjected to in situ
hybridization using an HDAC9 antisense riboprobe and tissue microarrays. A
35S-labeled cRNA riboprobe was prepared from a 295 bp cDNA fragment from
the HDAC9 coding region (FIG. 16D; SEQ ID NO:92). This fragment encoded
the most divergent region of the HDAC9 protein. The riboprobe was
hybridized to paraffin-embedded clinical tissue specimens derived from
normal or cancerous tissues, and processed by standard procedures (Lorenzi
et al., 1999, Oncogene 18:4742-4755). Hybridized sections were incubated
for 3 to 6 weeks, and the level and localization of HDAC9 staining were
evaluated by microscopy. Staining levels were quantified by a board-certified
pathologist.

HDAC9 mRNA levels were generally below the limit of detection
(staining level = 0) in normal tissues, including breast, kidney, testis, and liver
tissues. Low to moderate levels of HDAC9 mRNA (staining level = 1-2) were
detected in lymph node, brain, adrenal gland, pancreas, bladder, lung, and
gastric tissues (data not shown). Normal breast and prostate tissue showed
average staining levels of 0 and 1, respectively (FIGS. 17A-17C). A dramatic
increase in HDAC9 mRNA expression was detected in breast tumor (average
staining level = 2-3) and prostate tumor (average staining level = 2) tissues
(FIGS. 17A-17C). Preliminary data also showed increased expression of
HDAC9 in endometrial and ovarian tumors. Thus, HDAC9 was expressed at
very low levels in normal adult peripheral tissues, but was overexpressed in a
variety of tumors, including breast and prostate adenocarcinomas. This
suggested that HDAC9 expression correlated with the progression of breast
and prostate tumors.

EXAMPLE 12: EFFECT OF HDAC9c ON CELLULAR TRANSFORMATION

Results of the experiments above indicated that elevated HDAC9c
expression was associated with certain tumor cells. To further investigate its
involvement in tumorigenesis, HDAC9c was evaluated for its ability to
morphologically transform mouse fibroblasts. HDAC9c in pcDNA3.1 was
introduced by calcium phosphate transfection into 1.5 x 10^6 NIH/3T3 cells
(ATCC, Rockville, MD) in duplicate at 1.0 µg/10 cm plate. One set of cultures
received growth medium (DMEM containing 5% calf serum) while the parallel
culture received growth medium containing 750 µg/ml of G418 to develop
stable clonal populations.

After 10-14 days in culture, unselected plates were stained with
Giemsa (Sigma-Aldrich, St. Louis, MO), and morphologically transformed foci
were visualized. Selected clones were examined for growth in soft agar at
10^5, 10^4, or 10^3 cells/15 mm well following standard protocols. After 2-3
weeks in culture, colonies were visualized by microscopy and tetrazolium
violet staining. HDAC9c transfectants produced some foci in monolayer
culture (data not shown). However, the response was not robust, suggesting
that higher HDAC9c expression levels were required to transform
NIH/3T3 cells.

HDAC9c transfectants were also evaluated for anchorage-independent
growth. NIH/3T3 cells stably transfected with HDAC9c or FGF8 constructs, or
vector alone, were suspended in soft agar containing growth medium and
cultured for 2-3 weeks. FGF8 is a cDNA that potently transforms NIH/3T3
cells through autocrine stimulation of endogenous FGF receptors (Lorenzi et al.,
1995, Oncogene 10:2051-2055). In vector transfectants, very few colonies
greater than 50 µm in diameter were observed after three weeks in culture
(FIG. 18). In contrast, FGF8 transfectants produced several colonies greater
than 50 µm after three weeks (FIG. 18). HDAC9c transfectants also
produced significant colony growth compared to vector transfectants, but less
than that observed for FGF8 transfectants (FIG. 18). These results suggested
that overexpression of HDAC9c induced an oncogenic phenotype in mouse
fibroblasts.

EXAMPLE 13: EFFECT OF HDAC9c ON THE ACTIN CYTOSKELETON

Changes in the actin cytoskeleton often accompany the transformed
phenotype of cells expressing oncogenes such as Ras, Rho, or src. In
general, gene products that affect cell adhesion or motility are associated with
changes in the actin cytoskeleton. To investigate whether the transformation
induced by HDAC9c was associated with changes in the cytoskeletal
architecture, NIH/3T3 transfectants expressing HDAC9c were subjected to
fluorescent staining with TRITC-conjugated phalloidin to visualize filamentous
actin (F-actin).

In these experiments, an HDAC4 construct was used as a control. For
the control construct, full-length HDAC4 cDNA was amplified by RT-PCR from
first-strand cDNA based on the sequence reported by Grozinger et al. (Proc.
Natl. Acad. Sci. USA 96:4868-4873), and cloned into pcDNA3.1.
Mass-selected stable NIH/3T3 clones of HDAC9c (in pcDNA3.1), Ras, HDAC4, or
vector alone, were plated in 8 well chamber slides in duplicate and allowed to
adhere overnight in growth medium (DMEM high glucose containing 10% calf
serum). Cells were subsequently serum-starved for 18 hours and one set
was stimulated with 10% calf serum for 15 minutes. The cultures were fixed
for 30 minutes in 4% paraformaldehyde, permeabilized in 0.02% Triton X-100,
and incubated with TRITC or FITC conjugated phalloidin (Sigma, St. Louis,
MO) for 2 hours. Filamentous actin was visualized by fluorescence
microscopy, and images were captured with a digital camera.

In parental NIH/3T3 cells (data not shown) or vector transfectants, low
levels of F-actin stress fiber formation were observed following serum
starvation for 18 hours (FIG. 19). Stimulation of these cells for 15 minutes
with serum promoted an extensive stress fiber network (FIG. 19), indicating
that the extracellular signals regulating these pathways were intact in these
cells. A dramatic increase in stress fiber content and organization was
observed in serum-starved HDAC9c-expressing cells (FIG. 19), indicating
that expression of HDAC9c was sufficient to induce reorganization of the actin
cytoskeleton. In contrast, no stress fiber formation was observed in
serum-starved NIH/3T3 cells expressing the HDAC4 protein (FIG. 19). These results
suggested that induction of actin stress fiber formation underlay the
transformed phenotype associated with expression of HDAC9c.
Conclusion

Inhibitors of HDAC activity are involved in the regulation of cellular
proliferation, apoptosis, and differentiation of a variety of cell types. However,
little is known about the role of individual HDACs in tumor cells or in their
genesis. In accordance with the present invention, a unique HDAC isoform,
HDAC9c, has been identified and characterized. HDAC9 shows restricted
expression in normal adult tissues, but is overexpressed in several primary
human tumors, including those derived from breast and prostate cancers.
The overexpression of HDAC9c in in vitro models promoted the oncogenic
transformation of fibroblasts, and this transformed phenotype was associated
with the induction of actin cytoskeletal stress fiber formation. These results
suggest that a functional consequence of HDAC9c overexpression is the
promotion and/or maintenance of the transformation state of certain tumor
cells.

Members of the HDAC protein family have been shown to possess
potent ability to repress transcription. For instance, tumor suppressor genes
p21 and gelsolin are expressed upon HDAC inhibition (Sowa et al., 1999,
Cancer Res. 59(17):4266-70; Saito et al., 1999, Proc. Natl. Acad. Sci. USA
96:4592-4597). It is interesting to note that gelsolin negatively regulates the
formation of the actin cytoskeleton (Sun et al., 1999, J. Biol. Chem.
274:33179-33182). In contrast, actin cytoskeleton formation is positively
regulated by HDAC9c expression (FIG. 19). Thus, HDAC9c inhibition or
overexpression may regulate gelsolin levels, and this regulation may underlie
the cytoskeletal changes mediated by HDAC9c.

HDAC9 was overexpressed in greater than 90% of the breast and
prostate tumor specimens examined compared to corresponding tissue from
normal patients (FIGS. 17A-17B). By comparison, the epidermal growth
factor (EGF) receptor, erbB2, has been estimated to be overexpressed in
roughly 30% of certain tumor types (King et al., 1985, Science 229:974-976).
These observations strongly suggest that HDAC9c can be used as a
diagnostic marker for breast or prostate tumorigenesis. Hormonal signaling is
critical to the progression and treatment of breast cancers, and HDAC9 has
been implicated in transcription (Zhou et al., Proc. Natl. Acad. Sci. USA
98:10572-10577). Without wishing to be bound by theory, it is possible that
HDAC9 regulates estrogen or androgen responsive promoters in these tumor
cells. As shown herein, HDAC9 expression is increased in primary cancers,
and restricted in normal tissue expression. Further, HDAC9c expression
induces oncogenic transformation. The sum of these observations indicates
that HDAC9c can be used as a diagnostic and/or therapeutic target for certain
tumors or cancers, in particular, breast and prostate tumors or cancers.

EXAMPLE 14: HDAC9 SPLICE VARIANTS

Using the methods described herein, HDAC9 splice variants were
identified, including BMY_HDACX variant 1 (FIGS. 20A-20C; SEQ ID NO:94;
also called BMY_HDACX_v1 and HDACX_v1) and BMY_HDACX variant 2
(FIGS. 21A-21B; SEQ ID NO:96; also called BMY_HDACX_v2 and
HDACX_v2). The cDNA sequences for BMY_HDACX_v1 (SEQ ID NO:94)
and BMY_HDACX_v2 (SEQ ID NO:96) were aligned to the nucleotide
sequences of three reported splice products of the HDAC9 gene, including
HDAC9V1 (NCBI Ref. Seq. NM_058176; FIGS. 22A-22C; SEQ ID NO:97),
HDAC9V2 (NCBI Ref. Seq. NM_058177; FIGS. 22D-22F; SEQ ID NO:98),
and HDAC9V3 (NCBI Ref. Seq. NM_014707; FIGS. 22G-22I; SEQ ID
NO:100). The sequence alignment produced by ClustalW (D.G. Higgins et
al., 1996, Methods Enzymol. 266:383-402) is shown in FIGS. 23A-23K.

ClustalW sequence alignments indicated that the HDAC9c amino acid
sequence showed 80.5% identity to the HDAC9a (AY032738) amino acid
sequence, 94.1% identity to the HDAC9 (AY032737) amino acid sequence,
and 55.1% identity to the HDAC5 (AF132608) amino acid sequence. The
HDAC9c nucleotide sequence showed 81.4% identity to the HDAC9a
(AY032738) nucleotide sequence, 94.3% identity to the HDAC9 (AY032737)
nucleotide sequence, and 60.1% identity to the HDAC5 (AF132608)
nucleotide sequence. In addition, the HDACX_v2 amino acid sequence
showed 55.2% identity to the most closely related amino acid sequence, and
the HDACX_v2 nucleotide sequence showed 55.3% identity to the HDAC9a
(AY032738) nucleotide sequence, 48.1% identity to the HDAC9 (AY032737)
nucleotide sequence, and 27.6% identity to the HDAC5 (AF132608)
nucleotide sequence.
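
For readers unfamiliar with how a percent identity figure is derived from a pairwise alignment, the following Python sketch shows one simple convention. It is an assumption made for illustration: identical residues are counted over the aligned columns in which neither sequence has a gap, whereas alignment programs differ in the denominator they use, so this sketch would not necessarily reproduce the values reported above. The function name and the toy sequences are hypothetical.

```python
# Sketch: percent identity between two rows of an existing alignment.
# Gap handling (skip any column containing '-') is an assumed convention.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free aligned columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue               # ignore columns with a gap
        compared += 1
        if x.upper() == y.upper():
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned fragments, not taken from any sequence in the specification.
example = percent_identity("MKT-LQQ", "MKTALQA")
```

In the toy example, six columns are compared and five match, giving roughly 83.3% identity.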

Additional amino acid sequence alignments are shown in FIGS. 24A-24D
and FIGS. 25A-25C. For reference, the SEQ ID NOs of the sequences
of the present invention are listed in the table shown below. HDACX_v1 and
HDACX_v2 constructs were deposited at the American Type Culture
Collection (ATCC), 10801 University Boulevard, Manassas, VA 20110-2209
on ________ under ATCC Accession No. ________
according to the terms of the Budapest Treaty.



Description                                         SEQ ID NO:

BMY_HDAL1 nucleic acid sequence                     SEQ ID NO:1
BMY_HDAL1 amino acid sequence                       SEQ ID NO:2
BMY_HDAL1 reverse nucleic acid sequence             SEQ ID NO:3
BMY_HDAL2 amino acid sequence                       SEQ ID NO:4
BMY_HDAL3 amino acid sequence                       SEQ ID NO:5
SC_HDA1 amino acid sequence                         SEQ ID NO:6
Human HDAC4 amino acid sequence                     SEQ ID NO:7
Human HDAC5 amino acid sequence                     SEQ ID NO:8
Human HDAC7 amino acid sequence                     SEQ ID NO:9
Aquifex ACUC HDAL amino acid sequence               SEQ ID NO:10
AC002088 nucleic acid sequence                      SEQ ID NO:11
BMY_HDAL2 nucleic acid sequence                     SEQ ID NO:12
BMY_HDAL2 reverse nucleic acid sequence             SEQ ID NO:13
AC002410 nucleic acid sequence                      SEQ ID NO:14
N-terminus of BMY_HDAL3                             SEQ ID NO:15
C-terminus of BMY_HDAL3                             SEQ ID NO:16
BAC AC004994 nucleic acid sequence                  SEQ ID NO:17
BAC AC004744 nucleic acid sequence                  SEQ ID NO:18
BMY_HDAL3 nucleic acid sequence                     SEQ ID NO:19
BMY_HDAL3 reverse strand nucleic acid sequence      SEQ ID NO:20
AAC78618 amino acid sequence                        SEQ ID NO:21
AAD15364 amino acid sequence                        SEQ ID NO:22
AA287983 nucleic acid sequence                      SEQ ID NO:23
BMY_HDAL1 single exon primer                        SEQ ID NO:24
BMY_HDAL1 single exon primer                        SEQ ID NO:25
BMY_HDAL1 single exon primer                        SEQ ID NO:26
BMY_HDAL1 single exon primer                        SEQ ID NO:27
BMY_HDAL1 multiple exon primer                      SEQ ID NO:28
BMY_HDAL1 multiple exon primer                      SEQ ID NO:29
BMY_HDAL1 multiple exon primer                      SEQ ID NO:30
BMY_HDAL1 multiple exon primer                      SEQ ID NO:31
BMY_HDAL1 multiple exon primer                      SEQ ID NO:32
BMY_HDAL1 multiple exon primer                      SEQ ID NO:33
BMY_HDAL1 multiple exon primer                      SEQ ID NO:34
BMY_HDAL1 multiple exon primer                      SEQ ID NO:35
BMY_HDAL1 capture oligonucleotide                   SEQ ID NO:36
BMY_HDAL1 5' oligo primer                           SEQ ID NO:37
BMY_HDAL1 3' oligo primer                           SEQ ID NO:38
BMY_HDAL2 single exon primer                        SEQ ID NO:39
BMY_HDAL2 single exon primer                        SEQ ID NO:40
BMY_HDAL2 single exon primer                        SEQ ID NO:41
BMY_HDAL2 single exon primer                        SEQ ID NO:42
BMY_HDAL2 single exon primer                        SEQ ID NO:43
BMY_HDAL2 single exon primer                        SEQ ID NO:44
BMY_HDAL2 single exon primer                        SEQ ID NO:45
BMY_HDAL2 single exon primer                        SEQ ID NO:46
BMY_HDAL2 multiple exon primer                      SEQ ID NO:47
BMY_HDAL2 multiple exon primer                      SEQ ID NO:48
BMY_HDAL2 multiple exon primer                      SEQ ID NO:49
BMY_HDAL2 multiple exon primer                      SEQ ID NO:50
BMY_HDAL2 multiple exon primer                      SEQ ID NO:51
BMY_HDAL2 multiple exon primer                      SEQ ID NO:52
BMY_HDAL2 multiple exon primer                      SEQ ID NO:53
BMY_HDAL2 multiple exon primer                      SEQ ID NO:54
BMY_HDAL2 multiple exon primer                      SEQ ID NO:55
BMY_HDAL2 multiple exon primer                      SEQ ID NO:56
BMY_HDAL2 multiple exon primer                      SEQ ID NO:57
BMY_HDAL2 multiple exon primer                      SEQ ID NO:58
BMY_HDAL2 multiple exon primer                      SEQ ID NO:59
BMY_HDAL2 multiple exon primer                      SEQ ID NO:60
BMY_HDAL2 multiple exon primer                      SEQ ID NO:61
BMY_HDAL2 multiple exon primer                      SEQ ID NO:62
BMY_HDAL2 capture oligonucleotide                   SEQ ID NO:63
BMY_HDAL2 capture oligonucleotide                   SEQ ID NO:64
BMY_HDAL2 5' oligo primer                           SEQ ID NO:65
BMY_HDAL2 3' oligo primer                           SEQ ID NO:66
BMY_HDAL3 single exon primer                        SEQ ID NO:67
BMY_HDAL3 single exon primer                        SEQ ID NO:68
BMY_HDAL3 single exon primer                        SEQ ID NO:69
BMY_HDAL3 single exon primer                        SEQ ID NO:70
BMY_HDAL3 single exon primer                        SEQ ID NO:71
BMY_HDAL3 single exon primer                        SEQ ID NO:72
BMY_HDAL3 single exon primer                        SEQ ID NO:73
BMY_HDAL3 single exon primer                        SEQ ID NO:74
BMY_HDAL3 multiple exon primer                      SEQ ID NO:75
BMY_HDAL3 multiple exon primer                      SEQ ID NO:76
BMY_HDAL3 multiple exon primer                      SEQ ID NO:77
BMY_HDAL3 multiple exon primer                      SEQ ID NO:78
BMY_HDAL3 multiple exon primer                      SEQ ID NO:79
BMY_HDAL3 multiple exon primer                      SEQ ID NO:80
BMY_HDAL3 multiple exon primer                      SEQ ID NO:81
BMY_HDAL3 multiple exon primer                      SEQ ID NO:82
BMY_HDAL3 capture oligo                             SEQ ID NO:83
BMY_HDAL3 capture oligo                             SEQ ID NO:84
BMY_HDAL3 capture oligo                             SEQ ID NO:85
BMY_HDAL3 capture oligo                             SEQ ID NO:86
HDAC9c amino acid sequence                          SEQ ID NO:87
HDAC9c nucleotide sequence                          SEQ ID NO:88
HDAC9 (AY032737) amino acid sequence                SEQ ID NO:89
HDAC9a (AY032738) amino acid sequence               SEQ ID NO:90
HDAC4 (AF132608) amino acid sequence                SEQ ID NO:91
HDAC9 probe                                         SEQ ID NO:92
BMY_HDACX_v1 amino acid sequence                    SEQ ID NO:93
BMY_HDACX_v1 nucleotide sequence                    SEQ ID NO:94
BMY_HDACX_v2 amino acid sequence                    SEQ ID NO:95
BMY_HDACX_v2 nucleotide sequence                    SEQ ID NO:96
HDAC9v1 (NM_058176) amino acid sequence             SEQ ID NO:89
HDAC9v1 (NM_058176) nucleotide sequence             SEQ ID NO:97
HDAC9v2 (NM_058177) amino acid sequence             SEQ ID NO:90
HDAC9v2 (NM_058177) nucleotide sequence             SEQ ID NO:98
HDAC9v3 (NM_014707) amino acid sequence             SEQ ID NO:99
HDAC9v3 (NM_014707) nucleotide sequence             SEQ ID NO:100
HDAL1 primer                                        SEQ ID NO:101
HDAL2 primer                                        SEQ ID NO:102
HDAL3 primer                                        SEQ ID NO:103
HDAC9 forward primer                                SEQ ID NO:104
HDAC9 reverse primer                                SEQ ID NO:105
HDAC consensus nucleotide sequence                  SEQ ID NO:106
HDAC consensus amino acid sequence                  SEQ ID NO:107

The contents of all patents, patent applications, published PCT
applications and articles, books, references, reference manuals and abstracts
cited herein are hereby incorporated by reference in their entirety to more fully
describe the state of the art to which the invention pertains.

As various changes can be made in the above-described subject
matter without departing from the scope and spirit of the present invention, it
is intended that all subject matter contained in the above description, or
defined in the appended claims, be interpreted as descriptive and illustrative
of the present invention. Many modifications and variations of the present
invention are possible in light of the above teachings.


WHAT IS CLAIMED IS : 

1. An isolated polynucleotide encoding a histone deacetylase 
polypeptide which consists of an amino acid sequence selected from the 
group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID 

5 NO:87, SEQ ID NO:93, and SEQ ID NO:95. 

2. An isolated polynucleotide consisting of a nucleotide sequence 
selected from the group consisting of SEQ ID NO:1, SEQ ID NO:12, SEQ ID 
NO: 19, SEQ ID NO:88, SEQ ID NO:94, and SEQ ID NO:96. 

3. An primer consisting of a nucleotide sequence selected from the 
10 group consisting of SEQ ID NO:24-27, SEQ ID NO:28-35, SEQ ID NO:39-46, 

SEQ ID NO:47-62, SEQ ID NO:65-66, SEQ ID NO:67-74, SEQ ID NO:75-82, 
and SEQ ID NO:104-105. 

4. A probe consisting of a nucleotide sequence selected from the 
group consisting of SEQ ID NO:36, SEQ ID NO:63-64, SEQ ID NO:83-86, 
SEQ ID NO:92, and SEQ ID NO:101-103. 

5. A cell line comprising the isolated polynucleotide according to 
claim 1. 

6. An expression vector comprising the isolated polynucleotide 
according to claim 1. 

7. A host cell comprising the expression vector according to claim 
6, wherein the host cell is selected from the group consisting of bacterial, 
yeast, insect, mammalian, and human cells. 

8. An isolated polypeptide consisting of an amino acid sequence 
selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID 
NO:5, SEQ ID NO:87, SEQ ID NO:93, and SEQ ID NO:95. 

9. An antibody which binds specifically to the isolated polypeptide 
according to claim 8, wherein the antibody is selected from the group 
consisting of polyclonal and monoclonal antibodies. 






10. An antisense polynucleotide which consists of a nucleotide 
sequence selected from the group consisting of SEQ ID NO:36, SEQ ID 
NO:63-64, and SEQ ID NO:83-86. 

11. An expression vector comprising the antisense polynucleotide 
according to claim 10. 

12. A pharmaceutical composition selected from the group 
consisting of: 

a. a pharmaceutical composition comprising a monoclonal 
antibody that specifically binds to an isolated polypeptide consisting of an 

amino acid sequence selected from the group consisting of SEQ ID NO:2, 
SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:87, SEQ ID NO:93, and SEQ ID 
NO:95, and a physiologically acceptable carrier, diluent, or excipient; 

b. a pharmaceutical composition comprising an antisense 
polynucleotide which consists of a nucleotide sequence selected from the 

group consisting of SEQ ID NO:36, SEQ ID NO:63-64, and SEQ ID NO:83-86, 
and a physiologically acceptable carrier, diluent, or excipient; and 

c. a pharmaceutical composition comprising an expression vector 
comprising an isolated polynucleotide encoding a histone deacetylase 
polypeptide which consists of an amino acid sequence selected from the 

group of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:87, SEQ ID 
NO:93, and SEQ ID NO:95, and a physiologically acceptable carrier, diluent, 
or excipient. 

13. A method of treating a cancer selected from the group 
consisting of breast and prostate cancer comprising administering the 

pharmaceutical composition according to claim 12 in an amount effective for 
treating the cancer. 

14. A method of diagnosing a cancer selected from the group 
consisting of breast and prostate cancer comprising: 

a. incubating the primer according to claim 3 with a 
biological sample under conditions to allow the primer to amplify a 

polynucleotide in the sample to produce an amplification product; and 

b. measuring levels of amplification product formed in (a), 
wherein an alteration in these levels compared to standard levels indicates 
diagnosis of the cancer. 

15. A method of diagnosing a cancer selected from the group 
consisting of breast and prostate cancer comprising: 

a. incubating the probe according to claim 4 with a biological 
sample under conditions to allow the probe to hybridize with a polynucleotide 
in the sample to form a complex; and 

b. measuring levels of hybridization complex formed in (a), wherein an 
alteration in these levels compared to standard levels indicates diagnosis of 
the cancer. 

16. A method of diagnosing a cancer selected from the group 
consisting of breast and prostate cancer comprising: 

a. contacting the antibody according to claim 9 with a 
biological sample under conditions to allow the antibody to associate with a 

polypeptide in the sample to form a complex; and 

b. measuring levels of complex formed in (a), wherein an 
alteration in these levels compared to standard levels indicates diagnosis of 
the cancer. 

17. A method of detecting a histone deacetylase polynucleotide 

comprising: 

a. incubating the probe according to claim 4 with a biological 
sample under conditions to allow the probe to hybridize with a polynucleotide 
in the sample to form a complex; and 

b. identifying the complex formed in (a), wherein identification of the 
complex indicates detection of a histone deacetylase polynucleotide. 

18. A method of detecting a histone deacetylase polypeptide 
comprising: 

a. incubating the antibody according to claim 9 with a 
biological sample under conditions to allow the antibody to associate with a 

polypeptide in the sample to form a complex; and 

b. identifying the complex formed in (a), wherein 
identification of the complex indicates detection of a histone deacetylase 
polypeptide. 

19. A method of screening test agents to identify a candidate 
bioactive agent comprising: 

a. contacting the isolated polynucleotide according to claim 
1 with test agents under conditions to allow a test agent to associate with the 
polynucleotide to form a complex; 

b. detecting the complex of (b), wherein detection of the complex 
indicates identification of a candidate bioactive agent. 

20. A method of screening test agents to identify a candidate 
bioactive agent comprising: 

a. contacting the isolated polypeptide according to claim 8 
with test agents under conditions to allow a test agent to associate with the 

polypeptide to form a complex; 

b. detecting the complex of (b), wherein detection of the 
complex indicates identification of a candidate bioactive agent. 



GlyIleAlaTyrAspProLeuMetLeuLysHisGlnCysValCysGly 
1 ggaattgcctatgaccccttgatgctgaaacaccagtgcgtttgtggc 
ccttaacggatactggggaactacgactttgtggtcacgcaaacaccg 

AsnSerThrThrHisProGluHisAlaGlyArgIleGlnSerIleTrp 
49 aattccaccacccaccctgagcatgctggacgaatacagagtatctgg 
ttaaggtggtgggtgggactcgtacgacctgcttatgtctcatagacc 

SerArgLeuGlnGluThrGlyLeuLeuAsnLysCysGluArgIleGln 
97 tcacgactgcaagaaactgggctgctaaataaatgtgagcgaattcaa 
agtgctgacgttctttgacccgacgatttatttacactcgcttaagtt 

GlyArgLysAlaSerLeuGluGluIleGlnLeuValHisSerGluHis 
145 ggtcgaaaagccagcctggaggaaatacagcttgttcattctgaacat 
ccagcttttcggtcggacctcctttatgtcgaacaagtaagacttgta 

HisSerLeuLeuTyrGlyThrAsnProLeuAspGlyGlnLysLeuAsp 
193 cactcactgttgtatggcaccaaccccctggacggacagaagctggac 
gtgagtgacaacataccgtggttgggggacctgcctgtcttcgacctg 

ProArgIleLeuLeuGlyAspAspSerGlnLysPhePheSerSerLeu 
241 cccaggatactcctaggtgatgactctcaaaagtttttttcctcatta 
gggtcctatgaggatccactactgagagttttcaaaaaaaggagtaat 

ProCysGlyGlyLeuGlyValSerThr 
289 ccttgtggtggacttggggtaagtaca 
ggaacaccacctgaaccccattcatgt 
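
The three-row layout above (three-letter amino acids, sense strand, complementary strand) can be reproduced programmatically. The following is a minimal Python sketch and is not part of the original disclosure. It assumes the standard genetic code and a reading frame that starts at the first base; the helper names `complement` and `translate` are illustrative only.

```python
# Illustrative sketch: helper names are hypothetical; standard genetic code assumed.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build the codon table from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def complement(seq: str) -> str:
    """Return the base-by-base complement (not reversed), as drawn in the figure."""
    return seq.translate(str.maketrans("acgtACGT", "tgcaTGCA"))


def translate(seq: str) -> str:
    """Translate complete codons from the first base into three-letter residues."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return "".join(THREE_LETTER[CODON_TABLE[c]] for c in codons)


# First row of the figure above.
sense = "ggaattgcctatgaccccttgatgctgaaacaccagtgcgtttgtggc"
print(translate(sense))
print(complement(sense))
```

Running the sketch on the first 48 bases of the figure gives a quick consistency check between the nucleotide rows and the residue row.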
Genewise results from HDA5_HUMAN_run2 applied to AC002088 
Hit 1: bits = 149 

BAC start: 56543 

BAC end: 74703 

Protein start: 684 

Protein end: 788 

>Results for GCGPROT : HDA5_HUMAN vs AC002088 (forward) [0] 
genewisedb output 

Score 149.09 bits over entire alignment. 

This will be different from per-alignment scores. See manual for details 
For computer parsable output, try genewisedb -help or read the manual 
Scores as bits over a synchronous coding model 

Alignment 1 Score 148.82 (Bits) 



HDA5 684 GVVYDTFMLKHQCMCGNTHV 

G + YD +MLKHQC+CGN + 

GIAYDPLMLKHQCVCGNSTT 

AC002088 56543 ggaattgcctatgaccccttgatgctgaaacaccagtgcgtttgtggcaattccaccacc 

HPEHAGRIQSIWSRLQETG 
HPEHAGRIQSIWSRLQETG 
HPEHAGRIQSIWSRLQETG 
caccctgagcatgctggacgaatacagagtatctggtcacgactgcaagaaactggg 

HDA5 723 LLSKCE RIRGRK 

LL + KCE RI+GRK 
LLNKCE RIQGRK 

AC002088 56660 ctgctaaataaatgtgagGTAATCC Intron 1 CAGcgaattcaaggtcgaaaa 

< 0 [56678:69695]-0> 

A T L D 
A + L + 
A S L E 
gccagcctggag 

HDA5 739 EIQTVHSEYHTLLYGTSPLN 

EIQ VHSE + H + LLYGT + PL + 

EIQLVHSEHHSLLYGTNPLD 

AC002088 69726 gaaatacagcttgttcattctgaacatcactcactgttgtatggcaccaaccccctggac 

RQKLDSKKLL 
Q K L D + L L 

GQKLDPRILL 
ggacagaagctggaccccaggatactccta 

HDA5 769 PISQKMYAVLP 

SQK + + + LP 
G:G[ggt] DDSQKFFSSLP 
AC002088 69816 GGTCTGTA Intron 2 TAGGTgatgactctcaaaagtttttttcctcattacct 

<1 [69817:74644]-1> 
CGGIGVDS 
C G G + G V + 
CGGLGVST 
tgtggtggacttggggtaagtaca 

HDA5 783 G I G V D S 

G + G V + 

G L G V S T 
AC002088 74686 ggacttggggtaagtaca 
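
The intron annotations in the alignment above take the form `<phase [start:end]-phase>`. As a hedged illustration, which is not part of the original disclosure, the Python sketch below extracts those fields and derives an intron length. It assumes the bracketed values are inclusive genomic coordinates, which the listing itself does not state; `parse_introns` is a hypothetical helper name.

```python
import re

# Matches annotations such as "<0 [56678:69695]-0>" or "<0 [18300: 98]-0>".
INTRON_RE = re.compile(r"<(\d)[\s\-]*\[\s*(\d+)\s*:\s*(\d+)\s*\]\s*-\s*(\d)>")


def parse_introns(text: str) -> list:
    """Return one dict per intron annotation found in GeneWise-style text."""
    introns = []
    for match in INTRON_RE.finditer(text):
        phase = int(match.group(1))
        start = int(match.group(2))
        end = int(match.group(3))
        introns.append({
            "phase": phase,
            "start": start,
            "end": end,
            # Assumption: coordinates are inclusive; abs() covers reverse-strand hits.
            "length": abs(end - start) + 1,
        })
    return introns


sample = "<0 [56678:69695]-0>\n<1 [69817:74644]-1>"
for intron in parse_introns(sample):
    print(intron)
```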
MOTIFS FROM: BMY_HDAL1.AA.FASTA 
MISMATCHES: 0 

BMY_HDAL1.AA.FASTA CHECK: 4620 LENGTH: 105 ! 

AMIDATION XG(R,K) (R,K) 

XG(R) (K) 
48: KCERI QGRK ASLEE 

(ABSTRACT FILE: 0009. PDOC) 

ASN_GLYCOSYLATION N~(P)(S,T)~(P) 

N~P(T)~P 
17: QCVCG NSTT HPEHA 

(ABSTRACT FILE: 0001. PDOC) 

CAMP_PHOSPHO_SITE (R,K)2X(S,T) 

(R,K){2}X(S) 
50: ERIQG RKAS LEEIQ 

(ABSTRACT FILE: 0004. PDOC) 

CK2_PHOSPHO_SITE (S,T)X2(D,E) 

(T)X{2}(E) 
20: CGNST THPE HAGRI 

(S)X{2}(E) 
53: QGRKA SLEE IQLVH 

(ABSTRACT FILE: 0006. PDOC) 

MYRISTYL G~(E,D,R,K,H,P,F,Y,W)X2(S,T,A,G,C,N)~(P) 

G~(E,D,R,K,H,P,F,Y,W)X{2} (T)~P 
16: HQCVC GNSTTH PEHAG 

G~(E,D,R,K,H,P,F,Y,W)X{2}(S)~P 
100: SLPCG GLGVST 

(ABSTRACT FILE: 0008. PDOC) 

PKC_PHOSPHO_SITE (S,T)X(R,K) 

(S)X(K) 

89: LLGDD SQK FFSSL 

(ABSTRACT FILE: 0005. PDOC) 



FIG. 4 
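
The motif hits listed above can be cross-checked with ordinary regular expressions. The sketch below is illustrative and is not part of the original disclosure: the regex forms are hand translations of the motif definitions shown in the figure, the helper name `find_sites` is hypothetical, and the sequence is the 105-residue BMY_HDAL1 translation taken from the sequence figure of this document.

```python
import re

# Hand-translated regex equivalents of the motif definitions in the figure.
PATTERNS = {
    "AMIDATION": r".G[RK][RK]",
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",
    "CAMP_PHOSPHO_SITE": r"[RK]{2}.[ST]",
    "CK2_PHOSPHO_SITE": r"[ST].{2}[DE]",
    "MYRISTYL": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",
}

# Partial predicted BMY_HDAL1 sequence (105 residues) from the figure.
BMY_HDAL1 = (
    "GIAYDPLMLKHQCVCGNSTTHPEHAGRIQSIWSRLQETGLLNKCERIQGRKASLEEIQLVHSEH"
    "HSLLYGTNPLDGQKLDPRILLGDDSQKFFSSLPCGGLGVST"
)


def find_sites(sequence: str, pattern: str) -> list:
    """Return 1-based start positions of all (possibly overlapping) matches."""
    lookahead = re.compile("(?=(" + pattern + "))")
    return [m.start() + 1 for m in lookahead.finditer(sequence)]


for name, pattern in PATTERNS.items():
    print(name, find_sites(BMY_HDAL1, pattern))
```

A lookahead is used so that overlapping candidate sites are not skipped, which a plain left-to-right scan would do.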






ValAspSerAspThrIleTrpAsnGluLeuHisSerSerGlyAlaAlaArgMetAlaVal 
1 GTGGACAGTGACACCATTTGGAATGAGCTACACTCGTCCGGTGCTGCACGCATGGCTGTT 
CACCTGTCACTGTGGTAAACCTTACTCGATGTGAGCAGGCCACGACGTGCGTACCGACAA 

GlyCysValIleGluLeuAlaSerLysValAlaSerGlyGluLeuLysAsnGlyPheAla 
61 GGCTGTGTCATCGAGCTGGCTTCCAAAGTGGCCTCAGGAGAGCTGAAGAATGGGTTTGCT 
CCGACACAGTAGCTCGACCGAAGGTTTCACCGGAGTCCTCTCGACTTCTTACCCAAACGA 

ValValArgProProGlyHisHisAlaGluGluSerThrAlaMetGlyPheCysPhePhe 
121 GTTGTGAGGCCCCCTGGCCATCACGCTGAAGAATCCACAGCCATGGGGTTCTGCTTTTTT 
CAACACTCCGGGGGACCGGTAGTGCGACTTCTTAGGTGTCGGTACCCCAAGACGAAAAAA 

AsnSerValAlaIleThrAlaLysTyrLeuArgAspGlnLeuAsnIleSerLysIleLeu 
181 AATTCAGTTGCAATTACCGCCAAATACTTGAGAGACCAACTAAATATAAGCAAGATATTG 
TTAAGTCAACGTTAATGGCGGTTTATGAACTCTCTGGTTGATTTATATTCGTTCTATAAC 

IleValAspLeuAspValHisHisGlyAsnGlyThrGlnGlnAlaPheTyrAlaAspPro 
241 ATTGTAGATCTGGATGTTCACCATGGAAACGGTACCCAGCAGGCCTTTTATGCTGACCCC 
TAACATCTAGACCTACAAGTGGTACCTTTGCCATGGGTCGTCCGGAAAATACGACTGGGG 

SerIleLeuTyrIleSerLeuHisArgTyrAspGluGlyAsnPhePheProGlySerGly 
301 AGCATCCTGTACATTTCACTCCATCGCTATGATGAAGGGAACTTTTTCCCTGGCAGTGGA 
TCGTAGGACATGTAAAGTGAGGTAGCGATACTACTTCCCTTGAAAAAGGGACCGTCACCT 

AlaProAsnGluValGlyThrGlyLeuGlyGluGlyTyrAsnIleAsnIleAlaTrpThr 
361 GCCCCAAATGAGGTTGGAACAGGCCTTGGAGAAGGGTACAATATAAATATTGCCTGGACA 
CGGGGTTTACTCCAACCTTGTCCGGAACCTCTTCCCATGTTATATTTATAACGGACCTGT 

GlyGlyLeuAspProProMetGlyAspValGluTyrLeuGluAlaPheArgLeuValLeu 
421 GGTGGCCTTGATCCTCCCATGGGAGATGTTGAGTACCTTGAAGCATTCAGGTTGGTACTT 
CCACCGGAACTAGGAGGGTACCCTCTACAACTCATGGAACTTCGTAAGTCCAACCATGAA 

LeuSerLeu 
481 CTTTCTCTC 
GAAAGAGAG 
GENEWISE RESULTS FROM HDA5_HUMAN_RUN3 APPLIED TO AC002410 
HIT 1: BITS = 262 

BAC START: 15451 

BAC END: 58122 

PROTEIN START: 786 

PROTEIN END: 948 

>RESULTS FOR GCGPROT : HDA5_HUMAN VS AC002410 (FORWARD) [0] 
GENEWISEDB OUTPUT 

SCORE 262.30 BITS OVER ENTIRE ALIGNMENT. 

THIS WILL BE DIFFERENT FROM PER-ALIGNMENT SCORES. SEE MANUAL FOR DETAILS 
FOR COMPUTER PARSABLE OUTPUT, TRY GENEWISEDB -HELP OR READ THE MANUAL 
SCORES AS BITS OVER A SYNCHRONOUS CODING MODEL 

ALIGNMENT 1 SCORE 261.25 (BITS) 



HDA5 786 VDSDTVWNEMHS SSAVRMAVGCL 

VDSDT+WNE + HSS A RMAVGC + 

VDSDTIWNELHS.SGAARMAVGCV 

AC002410 15451 GTGGACAGTGACACCATTTGGAATGAGCTACACTCGTCCGGTGCTGCACGCATGGCTGTT 

LELAFKVAAGELK 
+ ELA KVA + GELK 

IELASKVASGELK 
ATCGAGCTGGCTTCCAAAGTGGCCTCAGGAGAGCTGAAG 

HDA5 822 NGFAIIRPPGHHAEES 

NGFA + + RPPGHHAEES 
NGFAVVRPPGHHAEES 

AC002410 15559 GTGAGGT INTRON 1 CAGAATGGGTTTGCTGTTGTGAGGCCCCCTGGCCATCACGCTGAAGAATCC 
<0 [15559:51266]-0> 

HDA5 838 TA GFCF FNSVAIT 

T A GFCF FNSVAIT 

T A M:M[ATG] GFCFFNSVAIT 

AC002410 51315 ACAGCCATGTAAGTA INTRON 2 CAGGGGGTTCTGCTTTTTTAATTCAGTTGCAATTACC 

<2 [51323:51566]-2> 

HDA5 852 AKLLQQKLNVGKVL IVDW 

AK L+ +LN+ K + LIVD 

AKYLRDQLNISKIL IVDL 

AC002410 51601 GCCAAATACTTGAGAGACCAACTAAATATAAGCAAGATATTGATTGTAGATCTGGTATGTA INTRON 3 

<0— [51655:57572] 

HDA5 870 DIHHGNGTQQAFYNDPSVLYISL 

D + HHGNGTQQAFY DPS + LYISL 

DVHHGNGTQQAFYADPSILYISL 

AC002410 57570 TAGGATGTTCACCATGGAAACGGTACCCAGCAGGCCTTTTATGCTGACCCCAGCATCCTGTACATTTCACTC 
-0> 

HRYDNGNFFPGSG 
HRYD GNFFPGSG 
HRYDEGNFFPGSG 
CATCGCTATGATGAAGGGAACTTTTTCCCTGGCAGTGGA 

HDA5 906 APEE VGGGPGVGYNVN 

APE VGGGGYN + N 

APNE VGTGLGEGYNIN 

AC002410 57681 GCCCCAAATGAGGTTCGGT INTRON 4 CAGGTTGGAACAGGCCTTGGAGAAGGGTACAATATAAAT 

<0 [57693:58005]-0> 
HDA5 922 VAWTGGVDPPIGDVEYLTAFRTVV 

+ AWTGG + DPP + GDVEYL AFR V + 

IAWTGGLDPPMGDVEYLEAFRLVL 

AC002410 58042 ATTGCCTGGACAGGTGGCCTTGATCCTCCCATGGGAGATGTTGAGTACCTTGAAGCATTCAGGTTGGTACTT 

M P I 
+ + 
L S L 
CTTTCTCTC 
PROSITE motifs identified in the partial predicted amino acid sequence of 
BMY_HDAL2. 

MOTIFS FROM: BMY_HDAL2.AA.FASTA 
MISMATCHES: 0 



BMY_HDAL2.AA.FASTA CHECK: 2381 LENGTH: 163 ! 

ASN_GLYCOSYLATION N~(P)(S,T)~(P) 

N~P(S)~P 
75: LRDQL NISK ILIVD 

N~P(T)~P 
90: DVHHG NGTQ QAFYA 

(ABSTRACT FILE: 0001. PDOC) 

MYRISTYL G~(E,D,R,K,H,P,F,Y,W)X2(S,T,A,G,C,N)~(P) 

G~(E,D,R,K,H,P,F,Y,W)X{2} (A)~P 
91: VHHGN GTQQAF YADPS 

G~(E,D,R,K,H,P,F,Y,W)X{2} (G) ~P 
126: APNEV GTGLGE GYNIN 

G~(E,D,R,K,H,P,F,Y,W)X{2} (G)~P 
128: NEVGT GLGEGY NINIA 

(ABSTRACT FILE: 0008. PDOC) 

PKC_PHOSPHO_SITE (S,T)X(R,K) 

(T)X(K) 

66: NSVAI TAK YLRDQ 

(ABSTRACT FILE: 0005. PDOC) 
GENEWISE RESULTS FROM HDA5_HUMAN_RUN3 APPLIED TO AC004994 
HIT 1: BITS = 176 

BAC START: 79767 

BAC END: 11 

PROTEIN START: 942 

PROTEIN END: 1055 

>RESULTS FOR GCGPROT : HDA5_HUMAN VS AC004994 (REVERSE) [0] 
GENEWISEDB OUTPUT 

SCORE 176.62 BITS OVER ENTIRE ALIGNMENT. 

THIS WILL BE DIFFERENT FROM PER-ALIGNMENT SCORES. SEE MANUAL FOR DETAILS 
FOR COMPUTER PARSABLE OUTPUT, TRY GENEWISEDB -HELP OR READ THE MANUAL 
SCORES AS BITS OVER A SYNCHRONOUS CODING MODEL 

ALIGNMENT 1 SCORE 174.85 (BITS) 



HDA5_HUMAN 942 RTVVMPIAHE FS PDVVLVSAGF DA 
RT + V P + A EF PD + VLVSAGF DA 

RTIVKPVAKEFDPDMVLVSAGFDA 

AC004994 -79767 AGGACCATCGTGAAGCCTGTGGCCAAAGAGTTTGATCCAGACATGGTCTTAGTATCTGCTGGATTTGATGCA 
VEGHLSPLGGYSVTA 
+ EGH PLGGY VT A 

LEGHTPPLGGYKVTA 
TTGGAAGGCCACACCCCTCCTCTAGGAGGGTACAAAGTGACGGCA 

HDA5_HUMAN 981 R FGHLTRQLMTLA 
+ FGHLT + QLMTLA 

K C:C[TGT] FGHLTKQLMTLA 

AC004994 -79650 AAATGTAAGTA INTRON 1 TAGGTTTTGGTCATTTGACGAAGCAATTGATGACATTGGCT 

<1 [79646:18435]-1> 

HDA5_HUMAN 995 GGRVVLAL EGGHDLTAICDASEAC 
GRVVLALEGGHDLTAICDAS EAC 

DGRVVLALEGGHDLTAICDASEAC 
AC004994 -18396 GATGGACGTGTGGTGTTGGCTCTAGAAGGAGGACATGATCTCACAGCCATCTGTGATGCATCAGAAGCCTGT 

VSALLSVE 

V + A L L E 

VNALLGNE 

GTAAATGCCCTTCTAGGAAATGAG 

HDA5_HUMAN 1027 LQPLDEAVLQQKPNIN 

L + PL E +L Q PN + N 

LEPLAEDILHQSPNMN 

AC004994 -18300 GTAAAAA INTRON 2 CAGCTGGAGCCACTTGCAGAAGATATTCTCCACCAAAGCCCGAATATGAAT 
<0 [18300: 98]-0> 

HDA5_HUMAN 1043 AVATLEKVIEIQS 

AV +L + K + IEIQS 

AVISLQKIIEIQS 
AC004994 -49 GCTGTTATTTCTTTACAGAAGATCATTGAAATTCAAAGT 
GENEWISE RESULTS FROM HDA5_HUMAN_RUN3 APPLIED TO AC004744 
HIT 1: BITS = 57 

BAC START: 85491 

BAC END: 43563 

PROTEIN START: 1022 

PROTEIN END: 1122 

>RESULTS FOR GCGPROT : HDA5_HUMAN VS AC004744 (REVERSE) [0] 
GENEWISEDB OUTPUT 

SCORE 57.38 BITS OVER ENTIRE ALIGNMENT. 

THIS WILL BE DIFFERENT FROM PER- ALIGNMENT SCORES. SEE MANUAL FOR DETAILS 
FOR COMPUTER PARSABLE OUTPUT, TRY GENEWISEDB -HELP OR READ THE MANUAL 
SCORES AS BITS OVER A SYNCHRONOUS CODING MODEL 

ALIGNMENT 1 SCORE 55.39 (BITS) 

HDA5 1022 LLSVELQPLDEAVLQQKPN 
LL + + L + PL E +L Q PN 

LLFLQLEPLAEDILHQSPN 

AC004744 -85491 CTACTATTCTTGCAGCTGGAGCCACTTGCAGAAGATATTCTCCACCAAAGCCCGAAT 

INAVATLEKVIEIQ 
+ NAV +L + K + IEIQ 

MNAVISLQKIIEIQ 
ATGAATGCTGTTATTTCTTTACAGAAGATCATTGAAATTCAA 

HDA5 1055 KHWSCVQKFAAGL 

K + W V + A 

S:S[AGC] KYWKSVRMVAVPR 
AC004744 85392 AGTATGTC INTRON 1 TAGGCAAGTATTGGAAGTCAGTAAGGATGGTGGCTGTGCCAAGG 
<1 [85391:63817]-1> 

HDA5 1069 GRSLREAQA GET EEAETVSAM 

G +L A Q EE ETVSA + 

GCALAGAQL — Q EETETVSAL 

AC004744 -63775 GGCTGTGCTCTGGCTGGTGCTCAGTTG CAAGAGGAGACAGAGACCGTTTCTGCCCTG 

ALLSVGAEQAQA AAARE H 
A L + V E Q A 

ASLTVDVEQPFA Q E 

GCCTCCCTAACAGTGGATGTGGAACAGCCCTTTGCT CAGGAA 

HDA5 1108 SP PAEEPMEQEPAL 

A EPME + EPAL 

D S R:R[AGA] TAGEPMEEEPAL 

AC004744 -63676 GACAGCAGGTATGAA INTRON 2 CAGAACTGCTGGTGAGCCTATGGAAGAGGAGCCAGCCTTG 
<2 [63668:43600]-2> 
1 50 
AC004744  (1)   
AC004994  (1)   aggaccatcgtgaagcctgtggccaaagagtttgatccagacatggtct 
BMY_HDAL3 (1)   aggaccatcgtgaagcctgtggccaaagagtttgatccagacatggtct 

51 100 
AC004744  (1)   
AC004994  (50)  tagtatctgctggatttgatgcattggaaggccacacccctcctctagga 
BMY_HDAL3 (50)  tagtatctgctggatttgatgcattggaaggccacacccctcctctagga 

101 150 
AC004744  (1)   
AC004994  (100) gggtacaaagtgacggcaaaatgttttggtcatttgacgaagcaattgat 
BMY_HDAL3 (100) gggtacaaagtgacggcaaaatgttttggtcatttgacgaagcaattgat 

151 200 
AC004744  (1)   
AC004994  (150) gacattggctgatggacgtgtggtgttggctctagaaggaggacatgatc 
BMY_HDAL3 (150) gacattggctgatggacgtgtggtgttggctctagaaggaggacatgatc 

201 250 
AC004744  (1)   
AC004994  (200) tcacagccatctgtgatgcatcagaagcctgtgtaaatgcccttctagga 
BMY_HDAL3 (200) tcacagccatctgtgatgcatcagaagcctgtgtaaatgcccttctagga 

251 300 
AC004744  (1)   agctggagccacttgcagaagatattctccaccaaagcccgaatat 
AC004994  (250) aatgagctggagccacttgcagaagatattctccaccaaagcccgaatat 
BMY_HDAL3 (250) aatgagctggagccacttgcagaagatattctccaccaaagcccgaatat 

301 350 
AC004744  (50)  gaatgctgttatttctttacagaagatcattgaaattcaaagcaagtatt 
AC004994  (300) gaatgctgttatttctttacagaagatcattgaaattcaaa 
BMY_HDAL3 (300) gaatgctgttatttctttacagaagatcattgaaattcaaagcaagtatt 

351 400 
AC004744  (100) ggaagtcagtaaggatggtggctgtgccaaggggctgtgctctggctggt 
AC004994  (340) 
BMY_HDAL3 (350) ggaagtcagtaaggatggtggctgtgccaaggggctgtgctctggctggt 

401 450 
AC004744  (150) gctcagttgcaagaggagacagagaccgtttctgccctggcctccctaac 
AC004994  (340) 
BMY_HDAL3 (400) gctcagttgcaagaggagacagagaccgtttctgccctggcctccctaac 

451 500 
AC004744  (200) agtggatgtggaacagccctttgctcaggaagacagcagaactgctggtg 
AC004994  (340) 
BMY_HDAL3 (450) agtggatgtggaacagccctttgctcaggaagacagcagaactgctggtg 

501 525 
AC004744  (250) agcctatggaagaggagccagcctt 
AC004994  (340) 
BMY_HDAL3 (500) agcctatggaagaggagccagcctt 
ArgThrIleValLysProValAlaLysGluPheAspProAspMetValLeuValSerAla 
1 AGGACCATCGTGAAGCCTGTGGCCAAAGAGTTTGATCCAGACATGGTCTTAGTATCTGCT 
TCCTGGTAGCACTTCGGACACCGGTTTCTCAAACTAGGTCTGTACCAGAATCATAGACGA 

GlyPheAspAlaLeuGluGlyHisThrProProLeuGlyGlyTyrLysValThrAlaLys 
61 GGATTTGATGCATTGGAAGGCCACACCCCTCCTCTAGGAGGGTACAAAGTGACGGCAAAA 
CCTAAACTACGTAACCTTCCGGTGTGGGGAGGAGATCCTCCCATGTTTCACTGCCGTTTT 

CysPheGlyHisLeuThrLysGlnLeuMetThrLeuAlaAspGlyArgValValLeuAla 
121 TGTTTTGGTCATTTGACGAAGCAATTGATGACATTGGCTGATGGACGTGTGGTGTTGGCT 
ACAAAACCAGTAAACTGCTTCGTTAACTACTGTAACCGACTACCTGCACACCACAACCGA 

LeuGluGlyGlyHisAspLeuThrAlaIleCysAspAlaSerGluAlaCysValAsnAla 
181 CTAGAAGGAGGACATGATCTCACAGCCATCTGTGATGCATCAGAAGCCTGTGTAAATGCC 
GATCTTCCTCCTGTACTAGAGTGTCGGTAGACACTACGTAGTCTTCGGACACATTTACGG 

LeuLeuGlyAsnGluLeuGluProLeuAlaGluAspIleLeuHisGlnSerProAsnMet 
241 CTTCTAGGAAATGAGCTGGAGCCACTTGCAGAAGATATTCTCCACCAAAGCCCGAATATG 
GAAGATCCTTTACTCGACCTCGGTGAACGTCTTCTATAAGAGGTGGTTTCGGGCTTATAC 

AsnAlaValIleSerLeuGlnLysIleIleGluIleGlnSerLysTyrTrpLysSerVal 
301 AATGCTGTTATTTCTTTACAGAAGATCATTGAAATTCAAAGCAAGTATTGGAAGTCAGTA 
TTACGACAATAAAGAAATGTCTTCTAGTAACTTTAAGTTTCGTTCATAACCTTCAGTCAT 

ArgMetValAlaValProArgGlyCysAlaLeuAlaGlyAlaGlnLeuGlnGluGluThr 
361 AGGATGGTGGCTGTGCCAAGGGGCTGTGCTCTGGCTGGTGCTCAGTTGCAAGAGGAGACA 
TCCTACCACCGACACGGTTCCCCGACACGAGACCGACCACGAGTCAACGTTCTCCTCTGT 

GluThrValSerAlaLeuAlaSerLeuThrValAspValGluGlnProPheAlaGlnGlu 
421 GAGACCGTTTCTGCCCTGGCCTCCCTAACAGTGGATGTGGAACAGCCCTTTGCTCAGGAA 
CTCTGGCAAAGACGGGACCGGAGGGATTGTCACCTACACCTTGTCGGGAAACGAGTCCTT 

AspSerArgThrAlaGlyGluProMetGluGluGluProAlaLeu 
481 GACAGCAGAACTGCTGGTGAGCCTATGGAAGAGGAGCCAGCCTTG 
CTGTCGTCTTGACGACCACTCGGATACCTTCTCCTCGGTCGGAAC 
PROSITE MOTIFS FROM: BMY_HDAL3.AA.FASTA 
MISMATCHES: 0 

BMY_HDAL3.AA.FASTA CHECK: 3930 LENGTH: 175 ! 

CK2_PHOSPHO_SITE (S,T)X2(D,E) 

(T)X{2}(D) 
51: TKQLM TLAD GRVVL 

(T)X{2}(E) 
164: QEDSR TAGE PMEEE 

(ABSTRACT FILE: 0006. PDOC) 

MYRISTYL G~(E,D,R,K,H,P,F,Y,W)X2 (S,T,A,G,C,N)~(P) 

G~(E,D,R,K,H,P,F,Y,W)X{2} (A) ~P 
128: VAVPR GCALAG AQLQE 

(ABSTRACT FILE: 0008. PDOC) 

PKC_PHOSPHO_SITE (S,T)X(R, K) 

(T)X(K) 

38: GGYKV TAK CFGHL 

(S)X(R) 

119: SKYWK SVR MVAVP 

(ABSTRACT FILE: 0005. PDOC) 
Multiple sequence alignment of BMY_HDAL3, AAC78618 and AAD15364 



[Alignment columns 1-100: residues illegible in the source.] 

101 150 
AAC78618  (100) NAVISLQKIIEIQ 
AAD15364  (16)  NAVISLQKIIEIQKLLVSLWKRSQPCEVPSPPLIFPVCDIIVYPPTPVPS 
BMY_HDAL3 (101) NAVISLQKIIEIQSKYWKSVRMVAVPRGCALAGAQLQEETETVSALASLT 

151 175 
AAC78618  (113) 
AAD15364  (66)  DMSCLLPGWHRFNGT 
BMY_HDAL3 (151) VDVEQPFAQEDSRTAGEPMEEEPAL 
BLASTN alignment of AA287983 and BMY_HDAL3 



SCORE = 224 BITS (113), EXPECT = 4E-57 
IDENTITIES = 120/121 (99%), GAPS = 1/121 (0%) 
STRAND = PLUS / MINUS 



BMY_HDAL3: 405 ATTTTGCCGTCACTTTGTACCCTCCTAGAGGAGGGGTGTGGCCTTCCAATGCATCAAATC 464 
               |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
AA287983:  207 ATTTTGCCGTCACTTTGTACCCTCCTAGAGGAGGGGTGTGGCCTTCCAATGCATCAAATC 148 

BMY_HDAL3: 465 CAGCAGATACTAAGACCATGTCTGGATCAAACTCTTTGGCCACAGGCTTCACGATGGTCC 524 
               ||||||||||||||||||||||||||||||||||||| |||||||||||||||||||||| 
AA287983:  147 CAGCAGATACTAAGACCATGTCTGGATCAAACTCTTT-GCCACAGGCTTCACGATGGTCC 89 

BMY_HDAL3: 525 T 525 
               | 
AA287983:   88 T 88 

1 MKKVKLIGTL DYGKYRYPKN HPLKIPRVSL LLRFKDAMNL IDEKELIKSR 
51 PATKEELLLF HTEDYINTLM EAERCQCVPK GAREKYNIGG YENPVSYAMF 
101 TGSSLATGST VQAIEEFLKG NVAFNPAGGM HHAFKSRANG FCYINNPAVG 
151 IEYLRKKGFK RILYIDLDAH HCDGVQEAFY DTDQVFVLSL HQSPEYAFPF 
201 EKGFLEEIGE GKGKGYNLNI PLPKGLNDNE FLFALEKSLE IVKEVFEPEV 
251 YLLQLGTDPL LEDYLSKFNL SNVAFLKAFN IVREVFGEGV YLGGGGYHPY 
301 ALARAWTLIW CELSGREVPE KLNNKAKELL KSIDFEEFDD EVDRSYMLET 
351 LKDPWRGGEV RKEVKDTLEK AKASS 



FIG. 14A 



Saccharomyces cerevisiae Histone Deacetylase 1 

1 MDSVMVKKEV LENPDHDLKR KLEENKEEEN SLSTTSKSKR QVIVPVCMPK 
51 IHYSPLKTGL CYDVRMRYHA KIFTSYFEYI DPHPEDPRRI YRIYKILAEN 
101 GLINDPTLSG VDDLGDLMLK IPVRAATSEE ILEVHTKEHL EFIESTEKMS 
151 REELLKETEK GDSVYFNNDS YASARLPCGG AIEACKAWE GRVKNSLAW 
201 RPPGHHAEPQ AAGGFCLFSN VAVAAKNILK NYPESVRRIM ILDWDIHHGN 
251 GTQKSFYQDD QVLYVSLHRF EMGKYYPGTI QGQYDQTGEG KGEGFNCNIT 
301 WPVGGVGDAE YMWAFEQWM PMGREFKPDL VTISSGFDAA DGDTIGQCHV 
351 TPSCYGHMTH MLKSLARGNL CWLEGGYML DAIARSALSV AKVLIGEPPD 
401 ELPDPLSDPK PEVIEMIDKV IRLQSKYWNC FRRRHANSGC NFNEPINDSI 
451 ISKNFPLQKA IRQQQQHYLS DEFNFVTLPL VSMDLPDNTV LCTPNISESN 
501 TIIIWHDTS DIWAKRNVIS GTIDLSSSVI IDNSLDFIKW GLDRKYGIID 
551 VNIPLTLFEP DNYSGMITSQ EVLIYLWDNY IKYFPSVAKI AFIGIGDSYS 
601 GIVHLLGHRD TRAVTKTVIN FLGDKQLKPL VPLVDETLSE WYFKNSLIFS 
651 NNSHQCWKEN ESRKPRKKFG RVLRCDTDGL NNIIEERFEE ATDFILDSFE 
701 EWSDEE 
Homo sapiens Histone Deacetylase 4 



1 MSSQSHPDGL SGRDQPVELL NPARVNHMPS TVDVATALPL QVAPSAVPMD 
51 LRLDHQFSLP VAEPALREQQ LQQELLALKQ KQQIQRQILI AEFQRQHEQL 
101 SRQHEAQLHE HIKQQQEMLA MKHQQELLEH QRKLERHRQE QELEKQHREQ 
151 KLQQLKNKEK GKESAVASTE VKMKLQEFVL NKKKALAHRN LNHCISSDPR 
201 YWYGKTQHSS LDQSSPPQSG VSTSYNHPVL GMYDAKDDFP LRKTASEPNL 
251 KLRSRLKQKV AERRSSPLLR RKDGPWTAL KKRPLDVTDS ACSSAPGSGP 
301 SSPNNSSGSV SAENGIAPAV PSIPAETSLA HRLVAREGSA APLPLYTSPS 
351 LPNITLGLPA TGPSAGTAGQ QDTERLTLPA LQQRLSLFPG THLTPYLSTS 
401 PLERDGGAAH SPLLQHMVLL EQPPAQAPLV TGLGALPLHA QSLVGADRVS 
451 PSIHKLRQHR PLGRTQSAPL PQNAQALQHL VIQQQHQQFL EKHKQQFQQQ 
501 QLQMNKIIPK PSEPARQPES HPEETEEELR EHQALLDEPY LDRLPGQKEA 
551 HAQAGVQVKQ EPIESDEEEA EPPREVEPGQ RQPSEQELLF RQQALLLEQQ 
601 RIHQLRNYQA SMEAAGIPVS FGGHRPLSRA QSSPASATFP VSVQEPPTKP 
651 RFTTGLVYDT LMLKHQCTCG SSSSHPEHAG RIQSIWSRLQ ETGLRGKCEC 
701 IRGRKATLEE LQTVHSEAHT LLYGTNPLNR QKLDSKKLLG SLASVFVRLP 
751 CGGVGVDSDT IWNEVHSAGA ARLAVGCWE LVFKVATGEL KNGFAWRPP 
801 GHHAEESTPM GFCYFNSVAV AAKLLQQRLS VSKILIVDWD VHHGNGTQQA 
851 FYSDPSVLYM SLHRYDDGNF FPGSGAPDEV GTGPGVGFNV NMAFTGGLDP 
901 PMGDAEYLAA FRTWMPIAS EFAPDWLVS SGFDAVEGHP TPLGGYNLSA 
951 RCFGYLTKQL MGLAGGRIVL ALEGGHDLTA ICDASEACVS ALLGNELDPL 

1001 PEKVLQQRPN ANAVRSMEKV MEIHSKYWRC LQRTTSTAGR SLIEAQTCEN 

1051 EEAETVTAMA SLSVGVKPAE KRPDEEPMEE EPPL 
Homo sapiens Histone Deacetylase 5 

1 MNSPNESDGM SGREPSLEIL PRTSLHSIPV TVEVKPVLPR AMPSSMGGGG 
51 GGSPSPVELR GALVGSVDPT LREQQLQQEL LALKQQQQLQ KQLLFAEFQK 
101 QHDHLTRQHE VQLQKHLKQQ QEMLAAKQQQ EMLAAKRQQE LEQQRQREQQ 
151 RQEELEKQRL EQQLLILRNK EKSKESAIAS TEVKLRLQEF LLSKSKEPTP 
201 GGLNHSLPQH PKCWGAHHAS LDQSSPPQSG PPGTPPSYKL PLPGPYDSRD 
251 DFPLRKTASE PNLKVRSRLK QKVAERRSSP LLRRKDGTVI STFKKRAVEI 
301 TGAGPGASSV CNSAPGSGPS SPNSSHSTIA ENGFTGSVPN IPTEMLPQHR 
351 ALPLDSSPNQ FSLYTSPSLP NISLGLQATV TVTNSHLTAS PKLSTQQEAE 
401 RQALQSLRQG GTLTGKFMST SSIPGCLLGV ALEGDGSPHG HASLLQHVLL 
451 LEQARQQSTL IAVPLHGQSP LVTGERVATS MRTVGKLPRH RPLSRTQSSP 
501 LPQSPQALQQ LVMQQQHQQF LEKQKQQQLQ LGKILTKTGE LPRQPTTHPE 
551 ETEEELTEQQ EVLLGEGALT MPREGSTESE STQEDLEEED EEEDGEEEED 
601 CIQVKDEEGE SGAEEGPDLE EPGAGYKKLF SDAQPLQPLQ VYQAPLSLAT 
651 VPHQALGRTQ SSPAAPGGMK SPPDQPVKHL FTTGVVYDTF MLKHQCMCGN 
701 THVHPEHAGR IQSIWSRLQE TGLLSKCERI RGRKATLDEI QTVHSEYHTL 
751 LYGTSPLNRQ KLDSKKLLGP ISQKMYAVLP CGGIGVDSDT VWNEMHSSSA 
801 VRMAVGCLLE LAFKVAAGEL KNGFAIIRPP GHHAEESTAM GFCFFNSVAI 
851 TAKLLQQKLN VGKVLIVDWD IHHGNGTQQA FYNDPSVLYI SLHRYDNGNF 
901 FPGSGAPEEV GGGPGVGYNV NVAWTGGVDP PIGDVEYLTA FRTVVMPIAH 
951 EFSPDVVLVS AGFDAVEGHL SPLGGYSVTA RCFGHLTRQL MTLAGGRVVL 
1001 ALEGGHDLTA ICDASEACVS ALLSVELQPL DEAVLQQKPN INAVATLEKV 
1051 IEIQSKHWSC VQKFAAGLGR SLREAQAGET EEAETVSAMA LLSVGAEQAQ 
1101 AAAAREHSPR PAEEPMEQEP AL 
Homo sapiens Histone Deacetylase 7 

1 MDLRVGQRPP VEPPPEPTLL ALQRPQRLHH HLFLAGLQQQ RSVEPMRLSM 

51 DTPMPELQVG PQEQELRQLL HKDKSKRSAV ASSWKQKLA EVILKKQQAA 

101 LERTVHPNSP GIPYRTLEPL ETEGATRSML SSFLPPVPSL PSDPPEHFPL 

151 RKTVSEPNLK LRYKPKKSLE RRKNPLLRKE SAPPSLRRRP AETLGDSSPS 

201 SSSTPASGCS SPNDSEHGPN PILGDSDRRT HPTLGPRGPI LGSPHTPLFL 

251 PHGLEPEAGG TLPSRLQPIL LLDPSGSHAP LLTVPGLGPL PFHFAQSLMT 

301 TERLSGSGLH WPLSRTRSEP LPPSATAPPP PGPMQPRKEQ LKTHVQVIKR 

351 SAKPSEKPRL RQIPSAEDLE TDGGGPGQW DDGLEHRELG HGQPEARGPA 

401 PLQQHPQVLL WEQQRLAGRL PRGSTGDTVL LPLAQGGHRP LSRAQSSPAA 

451 PASLSAPEPA SQARVLSSSE TPARTLPFTT GLIYDSVMLK HQCSCGDNSR 

501 HPEHAGRIQS IWSRLQERGL RSQCECLRGR KASLEELQSV HSERHVLLYG 

551 TNPLSRLKLD NGKLAGLLAQ RMFEMLPCGG VGVDTDTIWN ELHSSNAARW 

601 AAGSVTDLAF KVASRELKNG FAWRPPGHH ADHSTAMGFC FFNSVAIACR 

651 QLQQQSKASK ASKILIVDWD VHHGNGTQQT FYQDPSVLYI SLHRHDDGNF 

701 FPGSGAVDEV GAGSGEGFNV NVAWAGGLDP PMGDPEYLAA FRIWMPIAR 

751 EFSPDLVLVS AGFDAAEGHP APLGGYHVSA KCFGYMTQQL MNLAGGAWL 

801 ALEGGHDLTA ICDASEACVA ALLGNRVDPL SEEGWKQKPQ PQCHPLSGGR 

851 DPGAQ 
Human EST AA287983 

1 ggccttggagaagggtacaatataaatattgcctggacaggtggcctt 

49 gatcctcccatgggagatgttgagtaccttgaagcattcaggaccatc 

97 gtgaagcctgtggcaaagagtttgatccagacatggtcttagtatctg 

145 ctggatttgatgcattggaaggccacacccctcctctaggagggtaca 

193 aagtgacggcaaaataaactcctgtgctggaggtacaacagtttggaa 

241 gtatacttggggaaagagaaaacacaagatggaaggaagatctctctt 

289 ttcacatcgggagcac 



FIG. 14F 



Human predicted protein AAD15364 

1 LEPLAEDILH QSPNMNAVIS LQKIIEIQKL LVSLWKRSQP CEVPSPPLIF 
51 FVCDIIVYPP TPVPSDMSCL LPGWHRFNGT 



FIG. 14G 



Human predicted protein AAC78618 

1 TIVKPVAKEF DPDMVLVSAG FDALEGHTPP LGGYKVTAKC FGHLTKQLMT 
51 LADGRVVLAL EGGHDLTAIC DASEACVNAL LGNELEPLAE DILHQSPNMN 
101 AVISLQKIIE IQ 
1 ATGCACAGTATGATCAGCTCAGTGGATGTGAAGTCAGAAGTTCCTGTGGGCCTGGAGCCC 60 

1 MHSMISSVDVKSEVPVGLEP 20 

61 ATCTCACCTTTAGACCTAAGGACAGACCTCAGGATGATGATGCCCGTGGTGGACCCTGTT 120 

21 ISPLDLRTDLRMMMPVVDPV 40 

121 GTCCGTGAGAAGCAATTGCAGCAGGAATTACTTCTTATCCAGCAGCAGCAACAAATCCAG 180 

41 VREKQLQQELLLIQQQQQIQ 60 

181 AAGCAGCTTCTGATAGCAGAGTTTCAGAAACAGCATGAGAACTTGACACGGCAGCACCAG 240 

61 KQLLIAEFQKQHENLTRQHQ 80 

241 GCTCAGCTTCAGGAGCATATCAAGTTGCAACAGGAACTTCTAGCCATAAAACAGCAACAA 300 

81 AQLQEHIKLQQELLAIKQQQ 100 

301 GAACTCCTAGAAAAGGAGCAGAAACTGGAGCAGCAGAGGCAAGAACAGGAAGTAGAGAGG 360 

101 ELLEKEQKLE QQRQEQEVER 120 

361 CATCGCAGAGAACAGCAGCTTCCTCCTCTCAGAGGCAAAGATAGAGGACGAGAAAGGGCA 420 

121 HRREQQLPPLRGKDRGRERA 140 

421 GTGGCAAGTACAGAAGTAAAGCAGAAGCTTCAAGAGTTCCTACTGAGTAAATCAGCAACG 480 

141 VASTEVKQKLQEFLLSKSAT 160 

481 AAAGACACTCCAACTAATGGAAAAAATCATTCCGTGAGCCGCCATCCCAAGCTCTGGTAC 540 

161 KDTPTNGKNHSVSRHPKLWY 180 

541 ACGGCTGCCCACCACACATCATTGGATCAAAGCTCTCCACCCCTTAGTGGAACATCTCCA 600 

181 TAAHHTSLDQSSPPLSGTSP 200 

601 TCCTACAAGTACACATTACCAGGAGCACAAGATGCAAAGGATGATTTCCCCCTTCGAAAA 660 

201 SYKYTLPGAQDAKDDFPLRK 220 

661 ACTGCCTCTGAGCCCAACTTGAAGGTGCGGTCCAGGTTAAAACAGAAAGTGGCAGAGAGG 720 

221 TASEPNLKVRSRLKQKVAER 240 

721 AGAAGCAGCCCCTTACTCAGGCGGAAGGATGGAAATGTTGTCACTTCATTCAAGAAGCGA 780 

241 RSSPLLRRKDGNVVTSFKKR 260 

781 ATGTTTGAGGTGACAGAATCCTCAGTCAGTAGCAGTTCTCCAGGCTCTGGTCCCAGTTCA 840 

261 MFEVTESSVSSSSPGSGPSS 280 

841 CCAAACAATGGGCCAACTGGAAGTGTTACTGAAAATGAGACTTCGGTTTTGCCCCCTACC 900 

281 PNNGPTGSVTENETSVLPPT 300 

901 CCTCATGCCGAGCAAATGGTTTCACAGCAACGCATTCTAATTCATGAAGATTCCATGAAC 960 

301 PHAEQMVSQQRILIHEDSMN 320 

961 CTGCTAAGTCTTTATACCTCTCCTTCTTTGCCCAACATTACCTTGGGGCTTCCCGCAGTG 1020 

321 LLSLYTSPSLPNITLGLPAV 340 

1021 CCATCCCAGCTCAATGCTTCGAATTCACTCAAAGAAAAGCAGAAGTGTGAGACGCAGACG 1080 

341 PSQLNASNSLKEKQKCETQT 360 

1081 CTTAGGCAAGGTGTTCCTCTGCCTGGGCAGTATGGAGGCAGCATCCCGGCATCTTCCAGC 1140 

361 LRQGVPLPGQYGGSIPASSS 380 

1141 CACCCTCATGTTACTTTAGAGGGAAAGCCACCCAACAGCAGCCACCAGGCTCTCCTGCAG 1200 

381 HPHVTLEGKPPNSSHQALLQ 400 

1201 CATTTATTATTGAAAGAACAAATGCGACAGCAAAAGCTTCTTGTAGCTGGTGGAGTTCCC 1260 

401 HLLLKEQMRQQKLLVAGGVP 420 

1261 TTACATCCTCAGTCTCCCTTGGCAACAAAAGAGAGAATTTCACCTGGCATTAGAGGTAGC 1320 

421 LHPQSPLATKERISPGIRGT 440 

1321 CACAAATTGCCCCGTCACAGACCCCTGAACCGAACCCAGTCTGCACCTTTGCCTCAGAGC 1380 

441 HKLPRHRPLNRTQSAPLPQS 460 

1381 ACGTTGGCTCAGCTGGTCATTCAACAGCAACACCAGCAATTCTTGGAGAAGCAGAAGCAA 1440 

461 TLAQLVIQQQHQQFLEKQKQ 480 
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1441 


TACCAGCAGCAGATCCACATGAACAAACTGCTTTCGAAATCTATTGAACAACTGAAGCAA 


1500 


481 


YQQQIHMNKLLSKSIEQLKQ 


500 


1501 


CCAGGCAGTCACCTTGAGGAAGCAGAGGAAGAGCTTCAGGGGGACCAGGCGATGCAGGAA 


1560 


501 


PGSHLEEAEEELQGDQAMQE 


520 


1561 


GACAGAGCGCCCTCTAGTGGCAACAGCACTAGGAGCGACAGCAGTGCTTGTGTGGATGAC 


1620 


521 


DRAPSSGNSTRSDSSACVDD 


540 


1621 


ACACTGGGACAAGTTGGGGCTGTGAAGGTCAAGGAGGAACCAGTGGACAGTGATGAAGAT 


1680 


541 


TLGQVGAVKVKEE PVDSDED 


560 


1681 


GCTCAGATCCAGGAAATGGAATCTGGGGAGCAGGCTGCTTTTATGCAACAGCCTTTCCTG 


1740 


561 


AQIQEMESGEQAAFMQQPFL 


580 


1741 GAACCCACGCACACACGTGCGCTCTCTGTGCGCCAAGCTCCGCTGGCTGCGGTTGGCATG 1800

581 EPTHTRALSVRQAPLAAVGM 600

1801 GATGGATTAGAGAAACACCGTCTCGTCTCCAGGACTCACTCTTCCCCTGCTGCCTCTGTT 1860

601 DGLEKHRLVSRTHSSPAASV 620

1861 TTACCTCACCCGGCAATGGACCGCCCCCTCCAGCCTGGCTCTGCAACTGGAATTGCCTAT 1920

621 LPHPAMDRPLQPGSATGIAY 640

1921 GACCCCTTGATGCTGAAACACCAGTGCGTTTGTGGCAATTCCACCACCCACCCTGAGCAT 1980

641 DPLMLKHQCVCGNSTTHPEH 660

1981 GCTGGACGAATACAGAGTATCTGGTCACGACTGCAAGAAACTGGGCTGCTAAATAAATGT 2040

661 AGRIQSIWSRLQETGLLNKC 680

2041 GAGCGAATTCAAGGTCGAAAAGCCAGCCTGGAGGAAATACAGCTTGTTCATTCTGAACAT 2100

681 ERIQGRKASLEEIQLVHSEH 700

2101 CACTCACTGTTGTATGGCACCAACCCCCTGGACGGACAGAAGCTGGACCCCAGGATACTC 2160

701 HSLLYGTNPLDGQKLDPRIL 720

2161 CTAGGTGATGACTCTCAAAAGTTTTTTTCCTCATTACCTTGTGGTGGACTTGGGGTGGAC 2220

721 LGDDSQKFFSSLPCGGLGVD 740

2221 AGTGACACCATTTGGAATGAGCTACACTCGTCCGGTGCTGCACGCATGGCTGTTGGCTGT 2280

741 SDTIWNELHSSGAARMAVGC 760

2281 GTCATCGAGCTGGCTTCCAAAGTGGCCTCAGGAGAGCTGAAGAATGGGTTTGCTGTTGTG 2340

761 VIELASKVASGELKNGFAVV 780

2341 AGGCCCCCTGGCCATCACGCTGAAGAATCCACAGCCATGGGGTTCTGCTTTTTTAATTCA 2400

781 RPPGHHAEESTAMGFCFFNS 800

2401 GTTGCAATTACCGCCAAATACTTGAGAGACCAACTAAATATAAGCAAGATATTGATTGTA 2460

801 VAITAKYLRDQLNISKILIV 820

2461 GATCTGGATGTTCACCATGGAAACGGTACCCAGCAGGCCTTTTATGCTGACCCCAGCATC 2520

821 DLDVHHGNGTQQAFYADPSI 840


2521 CTGTACATTTCACTCCATCGCTATGATGAAGGGAACTTTTTCCCTGGCAGTGGAGCCCCA 2580

841 LYISLHRYDEGNFFPGSGAP 860

2581 AATGAGGTTGGAACAGGCCTTGGAGAAGGGTACAATATAAATATTGCCTGGACAGGTGGC 2640

861 NEVGTGLGEGYNINIAWTGG 880

2641 CTTGATCCTCCCATGGGAGATGTTGAGTACCTTGAAGCATTCAGGACCATCGTGAAGCCT 2700

881 LDPPMGDVEYLEAFRTIVKP 900

2701 GTGGCCAAAGAGTTTGATCCAGACATGGTCTTAGTATCTGCTGGATTTGATGCATTGGAA 2760

901 VAKEFDPDMVLVSAGFDALE 920

2761 GGCCACACCCCTCCTCTAGGAGGGTACAAAGTGACGGCAAAATGTTTTGGTCATTTGACG 2820

921 GHTPPLGGYKVTAKCFGHLT 940

2821 AAGCAATTGATGACATTGGCTGATGGACGTGTGGTGTTGGCTCTAGAAGGAGGACATGAT 2880

941 KQLMTLADGRVVLALEGGHD 960

2881 CTCACAGCCATCTGTGATGCATCAGAAGCCTGTGTAAATGCCCTTCTAGGAAATGAGCTG 2940

961 LTAICDASEACVNALLGNEL 980 

2941 GAGCCACTTGCAGAAGATATTCTCCACCAAAGCCCGAATATGAATGCTGTTATTTCTTTA 3000

981 EPLAEDILHQSPNMNAVISL 1000

3001 CAGAAGATCATTGAAATTCAAAGCAAGTATTGGAAGTCAGTAAGGATGGTGGCTGTGCCA 3060

1001 QKIIEIQSKYWKSVRMVAVP 1020

3061 AGGGGCTGTGCTCTGGCTGGTGCTCAGTTGCAAGAGGAGACAGAGACCGTTTCTGCCCTG 3120

1021 RGCALAGAQLQEETETVSAL 1040

3121 GCCTCCCTAACAGTGGATGTGGAACAGCCCTTTGCTCAGGAAGACAGCAGAACTGCTGGT 3180 

1041 ASLTVDVEQPFAQEDSRTAG 1060 

3181 GAGCCTATGGAAGAGGAGCCAGCCTTGTGAAGTGCCAAGTCCCCCTCTGATATTTCCTGT 3240 

1061 EPMEEEPAL 1069 

3241 GTGTGACATCATTGTGTATCCCCCCACCCCAGTACCCTCAGACATGTCTTGTCTGCTGCC 3300

3301 TGGGTGGCACAGATTCAATGGAACATAAACACTGGGCACAAAATTCTGAACAGCAGCTTC 3360

3361 ACTTGTTCTTTGGATGGACTTGAAAGGGCATTAAAGATTCCTTAAACGTAACCGCTGTGA 3420

3421 TTCTAGAGTTACAGTAAACCACGATTGGAAGAAACTGCTTCCAGCATGCTTTTAATATGC 3480

3481 TGGGTGACCCACTCCTAGACACCAAGTTTGAACTAGAAACATTCAGTACAGCACTAGATA 3540

3541 TTGTTAATTTCAGAAGCTATGACAGCCAGTGAAATTTTGGGCAAAACCTGAGACATAGTC 3600

3661 GTTTACAAGATTTGCTTTTAGCTATGAACGGATCGTAATTCCACCCAGAATGTAATGTTT 3720

3721 CTTGTTTGTTTGTTTTGTTTTGTTAGGGTTTTTTTCTCAACTTTAACACACAGTTGAACT 3780

3781 GTTCCTAGTAAAAGTTCAAGATGGAGGAACTAGCATGAGGCTTTTTTCAGTATCTCGAAG 3840 

3841 TCCAAATGCCAAAGGAACCTCACACACTGTTTGTAATGGTGCAATATTTTATATCACTTT 3900 

3901 TTTTTAAACATCCCCAACATCTTTGTGTTCTCACACACAGGCAATTTGCAATGTTGCAAT 3960 

3961 TGTGTTGGAGAATGAAGTCCCCCCACCTCCCAGCCACACACACATCCTTTGTTCTCATGA 4020

4021 CAGTAGGTCTGAGCAAATGTTCCACCAAGCATTTTCAGTGTCTTTGAAAAGCACGTAACT 4080

4081 TTTCAAAGGTGGTCTTAATTTGCTGCATATCTATCAAGGACTTATTCACTCACCTTTCCT 4140 

4141 TTTCTGCCCTCTATCAATTGATTTCTTCTTACCTTTCATCATTCATTCCTTCCTTTAGAA 4200 

4201 AAACTGAAGATTACCCATAATCTCCTCTTATTACTTGAGGGCCTTGACTATTTAGTTTAT 4260 

4261 TTTGTTTACTTTACAGGTTAACACAGTTGTTTTGTCTGATTGCATTTTATTAACTGTGAA 4320

4321 GCCGTTGAAATGAATATCACTTAAGCAACGTTGCTAAATTTCTATGTGTTTGAAATGTGT 4380

4381 TAATGAAGGCACTGCTTATTTGTAGTCACCTTGAACTGACTTAACCTAGAAGCTGTGCCT 4440

4441 TCTTGTGAAAAAAAAAAAAAAAAAAAA 4467

2901 GG AAATGAGCTG GAGCCACTTG

2951 CAGAAGATAT TCTCCACCAA AGCCCGAATA TGAATGCTGT TATTTCTTTA

3001 CAGAAGATCA TTGAAATTCA AAGCAAGTAT TGGAAGTCAG TAAGGATGGT

3051 GGCTGTGCCA AGGGGCTGTG CTCTGGCTGG TGCTCAGTTG CAAGAGGAGA

3101 CAGAGACCGT TTCTGCCCTG GCCTCCCTAA CAGTGGATGT GGAACAGCCC 

3151 TTTGCTCAGG AAGACAGCAG AACTGCTGGT GAGCCTATGG AAGAGGAGCC 

3201 AGCCTTGTGA

AlaGluAsnGluThrSerValLeuProProThrProHisAlaGluGlnMetValSerGln
1 GCTGAAAATGAGACTTCGGTTTTGCCCCCTACCCCTCATGCCGAGCAAATGGTTTCACAG

GlnArgIleLeuIleHisGluAspSerMetAsnLeuLeuSerLeuTyrThrSerProSer
61 CAACGCATTCTAATTCATGAAGATTCGATGAACCTGCTAAGTCTTTATACCTCTCCTTCT

LeuProAsnIleThrLeuGlyLeuProAlaValProSerGlnLeuAsnAlaSerAsnSer
121 TTGCCCAACATTACCTTGGGGCTTCCCGCAGTGCCATCCCAGCTCAATGCTTCGAATTCA 

LeuLysGluLysGlnLysCysGluThrGlnThrLeuArgGlnGlyValProLeuProGly
181 CTCAAAGAAAAGCAGAAGTGTGAGACGCAGACGCTTAGGCAAGGTGTTCCTCTGCCTGGG 

GlnTyrGlyGlySerIleProAlaSerSerSerHisProHisValThrLeuGluGlyLys
241 CAGTATGGAGGCAGCATCCCGGCATCTTCCAGCCACCCTCATGTTACTTTAGAGGGAAAG

ProProAsnSerSerHisGlnAlaLeuLeuGlnHisLeuLeuLeuLysGluGlnMetArg 
301 CCACCCAACAGCAGCCACCAGGCTCTCCTGCAGCATTTATTATTGAAAGAACAAATGCGA 

GlnGlnLysLeuLeuValAlaGlyGlyValProLeuHisProGlnSerProLeuAlaThr 
361 CAGCAAAAGCTTCTTGTAGCTGGTGGAGTTCCCTTACATCCTCAGTCTCCCTTGGCAACA

LysGluArgIleSerProGlyIleArgGlyThrHisLysLeuProArgHisArgProLeu
421 AAAGAGAGAATTTCACCTGGCATTAGAGGTACCCACAAATTGCCCCGTCACAGACCCCTG

AsnArgThrGlnSerAlaProLeuProGlnSerThrLeuAlaGlnLeuValIleGlnGln
481 AACCGAACCCAGTCTGCACCTTTGCCTCAGAGCACGTTGGCTCAGCTGGTCATTCAACAG

GlnHisGlnGlnPheLeuGluLysGlnLysGlnTyrGlnGlnGlnIleHisMetAsnLys
541 CAACACCAGCAATTCTTGGAGAAGCAGAAGCAATACCAGCAGCAGATCCACATGAACAAA 

LeuLeuSerLysSerIleGluGlnLeuLysGlnProGlySerHisLeuGluGluAlaGlu
601 CTGCTTTCGAAATCTATTGAACAACTGAAGCAACCAGGCAGTCACCTTGAGGAAGCAGAG 

GluGluLeuGlnGlyAspGlnAlaMetGlnGluAspArgAlaProSerSerGlyAsnSer 
661 GAAGAGCTTCAGGGGGACCAGGCGATGCAGGAAGACAGAGCGCCCTCTAGTGGCAACAGC

ThrArgSerAspSerSerAlaCysValAspAspThrLeuGlyGlnValGlyAlaValLys 
721 ACTAGGAGCGACAGCAGTGCTTGTGTGGATGACACACTGGGACAAGTTGGGGCTGTGAAG

ValLysGluGluProValAspSerAspGluAspAlaGlnIleGlnGluMetGluSerGly
781 GTCAAGGAGGAACCAGTGGACAGTGATGAAGATGCTCAGATCCAGGAAATGGAATCTGGG 

GluGlnAlaAlaPheMetGlnGlnProPheLeuGluProThrHisThrArgAlaLeuSer 
841 GAGCAGGCTGCTTTTATGCAACAGCCTTTCCTGGAACCCACGCACACACGTGCGCTCTCT 

ValArgGlnAlaProLeuAlaAlaValGlyMetAspGlyLeuGluLysHisArgLeuVal 
901 GTGCGCCAAGCTCCGCTGGCTGCGGTTGGCATGGATGGATTAGAGAAACACCGTCTCGTC

SerArgThrHisSerSerProAlaAlaSerValLeuProHisProAlaMetAspArgPro 
961 TCCAGGACTCACTCTTCCCCTGCTGCCTCTGTTTTACCTCACCCGGCAATGGACCGCCCC 

LeuGlnProGlySerAlaThrGlyIleAlaTyrAspProLeuMetLeuLysHisGlnCys
1021 CTCCAGCCTGGCTCTGCAACTGGAATTGCCTATGACCCCTTGATGCTGAAACACCAGTGC 

ValCysGlyAsnSerThrThrHisProGluHisAlaGlyArgIleGlnSerIleTrpSer
1081 GTTTGTGGCAATTCCACCACCCACCCTGAGCATGCTGGACGAATACAGAGTATCTGGTCA 

FIG. 20A 



WO 02/102323 



PCT/US02/19560 



ArgLeuGlnGluThrGlyLeuLeuAsnLysCysGluArgIleGlnGlyArgLysAlaSer
1141 CGACTGCAAGAAACTGGGCTGCTAAATAAATGTGAGCGAATTCAAGGTCGAAAAGCCAGC 

LeuGluGluIleGlnLeuValHisSerGluHisHisSerLeuLeuTyrGlyThrAsnPro 
1201 CTGGAGGAAATACAGCTTGTTCATTCTGAACATCACTCACTGTTGTATGGCACCAACCCC

LeuAspGlyGlnLysLeuAspProArgIleLeuLeuGlyAspAspSerGlnLysPhePhe
1261 CTGGACGGACAGAAGCTGGACCCCAGGATACTCCTAGGTGATGACTCTCAAAAGTTTTTT

SerSerLeuProCysGlyGlyLeuGlyValAspSerAspThrIleTrpAsnGluLeuHis
1321 TCCTCATTACCTTGTGGTGGACTTGGGGTGGACAGTGACACCATTTGGAATGAGCTACAC

SerSerGlyAlaAlaArgMetAlaValGlyCysValIleGluLeuAlaSerLysValAla
1381 TCGTCCGGTGCTGCACGCATGGCTGTTGGCTGTGTCATCGAGCTGGCTTCCAAAGTGGCC 

SerGlyGluLeuLysAsnGlyPheAlaValValArgProProGlyHisHisAlaGluGlu 
1441 TCAGGAGAGCTGAAGAATGGGTTTGCTGTTGTGAGGCCCCCTGGCCATCACGCTGAAGAA

SerThrAlaMetGlyPheCysPhePheAsnSerValAlaIleThrAlaLysTyrLeuArg
1501 TCCACAGCCATGGGGTTCTGCTTTTTTAATTCAGTTGCAATTACCGCCAAATACTTGAGA 

AspGlnLeuAsnIleSerLysIleLeuIleValAspLeuAspValHisHisGlyAsnGly
1561 GACCAACTAAATATAAGCAAGATATTGATTGTAGATCTGGATGTTCACCATGGAAACGGT 

ThrGlnGlnAlaPheTyrAlaAspProSerIleLeuTyrIleSerLeuHisArgTyrAsp
1621 ACCCAGCAGGCCTTTTATGCTGACCCCAGCATCCTGTACATTTCACTCCATCGCTATGAT 

GluGlyAsnPhePheProGlySerGlyAlaProAsnGluValGlyThrGlyLeuGlyGlu 
1681 GAAGGGAACTTTTTCCCTGGCAGTGGAGCCCCAAATGAGGTTGGAACAGGCCTTGGAGAA 

GlyTyrAsnIleAsnIleAlaTrpThrGlyGlyLeuAspProProMetGlyAspValGlu
1741 GGGTACAATATAAATATTGCCTGGACAGGTGGCCTTGATCCTCCCATGGGAGATGTTGAG 

TyrLeuGluAlaPheArgThrIleValLysProValAlaLysGluPheAspProAspMet
1801 TACCTTGAAGCATTCAGGACCATCGTGAAGCCTGTGGCCAAAGAGTTTGATCCAGACATG 

ValLeuValSerAlaGlyPheAspAlaLeuGluGlyHisThrProProLeuGlyGlyTyr 
1861 GTCTTAGTATCTGCTGGATTTGATGCATTGGAAGGCCACACCCCTCCTCTAGGAGGGTAC 

LysValThrAlaLysCysPheGlyHisLeuThrLysGlnLeuMetThrLeuAlaAspGly 
1921 AAAGTGACGGCAAAATGTTTTGGTCATTTGACGAAGCAATTGATGACATTGGCTGATGGA

ArgValValLeuAlaLeuGluGlyGlyHisAspLeuThrAlaIleCysAspAlaSerGlu
1981 CGTGTGGTGTTGGCTCTAGAAGGAGGACATGATCTCACAGCCATCTGTGATGCATCAGAA 

AlaCysValAsnAlaLeuLeuGlyAsnGluLeuGluProLeuAlaGluAspIleLeuHis
2041 GCCTGTGTAAATGCCCTTCTAGGAAATGAGCTGGAGCCACTTGCAGAAGATATTCTCCAC 

GlnSerProAsnMetAsnAlaValIleSerLeuGlnLysIleIleGluIleGlnSerLys
2101 CAAAGCCCGAATATGAATGCTGTTATTTCTTTACAGAAGATCATTGAAATTCAAAGCAAG 

TyrTrpLysSerValArgMetValAlaValProArgGlyCysAlaLeuAlaGlyAlaGln
2161 TATTGGAAGTCAGTAAGGATGGTGGCTGTGCCAAGGGGCTGTGCTCTGGCTGGTGCTCAG 

LeuGlnGluGluThrGluThrValSerAlaLeuAlaSerLeuThrValAspValGluGln 
2221 TTGCAAGAGGAGACAGAGACCGTTTCTGCCCTGGCCTCCCTAACAGTGGATGTGGAACAG

ProPheAlaGlnGluAspSerArgThrAlaGlyGluProMetGluGluGluProAlaLeu
2281 CCCTTTGCTCAGGAAGACAGCAGAACTGCTGGTGAGCCTATGGAAGAGGAGCCAGCCTTG

***
2341 TGAAGTGCCAAGTCCCCCTCTGATATTTCCTGTGTGTGACATCATTGTGTATCCCCCCAC

2401 CCCAGTACCCTCAGACATGTCTTGTCTGCTGCCTGGGTGGCACAGATTCAATGGAACATA 

2461 AACACTGGGCACAAAATTCTGAACAGCAGCTTCACTTGTTCTTTGGATGGACTTGAAAGG

2521 GCATTAAAGATTCCTTAAACGTAACCGCTGTGATTCTAGAGTTACAGTAAACCACGATTG

2581 GAAGAAACTGCTTCCAGCATGCTTTTAATATGCTGGGTGACCCACTCCTAGACACCAAGT

2641 TTGAACTAGAAACATTCAGTACAGCACTAGATATTGTTAATTTCAGAAGCTATGACAGCC 

2701 AGTGAAATTTTGGGCAAAACCTGAGACATAGTCATTCCTGACATTCTGATCAGCTTTTTT 

2761 TGGGGTAATTTGTTTTTCAAACAGTCTTAACTTGTTTACAAGATTTGCTTTTAGCTATGA 

2821 ACGGATCGTAATTCCACCCAGAATGTAATGTTTCTTGTTTGTTTGTTTTGTTTTGTTAGG

2881 GTTTTTTTCTCAACTTTAACACACAGTTCAACTGTTCCTAGTAAAAGTTCAAGATGGAGG

2941 AACTAGCATGAGGCTTTTTTCAGTATCTCGAAGTCCAAATGCCAAAGGAACCTCACACAC 

3001 TGTTTGTAATGGTGCAATATTTTATATCACTTTTTTTTAAACATCCCCAACATCTTTGTG

3061 TTCTCACACACAGGCAATTTGCAATGTTGCAATTGTGTTGGAGAATGAAGTCCCCCCACC 

3121 TCCCAGCCACACACACATCCTTTGTTCTCATGACAGTAGGTCTGAGCAAATGTTCCACCA 

3181 AGCATTTTCAGTGTCTTTGAAAAGCACGTAACTTTTCAAAGGTGGTCTTAATTTGCTGCA 

3241 TATCTATCAAGGACTTATTCACTCACCTTTCCTTTTCTGCCCTCTATCAATTGATTTCTT

3301 CTTACCTTTCATCATTCATTCCTTCCTTTAGAAAAACTGAAGATTACCCATAATCTCCTC 

3361 TTATTACTTGAGGGCCTTGACTATTTAGTTTATTTTGTTTACTTTACAGGTTAACACAGT 

3421 TGTTTTGTCTGATTGCATTTTATTAACTGTGAAGCCGTTGAAATGAATATCACTTAAGCA 

3481 ACGTTGCTAAATTTCTATGTGTTTGAAATGTGTTAATGAAGGCACTGCTTATTTGTAGTC 

3541 ACCTTGAACTGACTTAACCTAGAAGCTGTGCCTTCTTGTGAAAAAAAAAAAAAAAAAAAA

3601 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

1 CCACGCGTCCGTAGGAGAAGGGCACCGGCTGGAGCCACTTGCAGGACTGAGGGTTTTTGC

61 AACAAAACCCTAGCAGCCTGAAGAACTCTAAGCCAGGTTTAATTGGTTTCTTTTTCTCGT 

121 GGGTAGACTTAATAATTTTCTACGTATTCTGACAAAGAAATAACCCCGAAGCACGTTCCT 

181 ATTTCCCACCTGCTTGTAGTTTCCGGGATAACCTAAACTCCAGAGAGCTATAGCATCCAC 

241 TCTGTCCTTTCTGCTTTGCACACAGATGGGGTGGCTGGACGAGAGCAGCTCTTGGCTCAG

MetHisSerMetIleSerSerValAspValLysSerGluValProValGlyLeu
301 CAAAGAATGCACAGTATGATCAGCTCAGTGGATGTGAAGTCAGAAGTTCCTGTGGGCCTG

GluProIleSerProLeuAspLeuArgThrAspLeuArgMetMetMetProValValAsp 
361 GAGCCCATCTCACCTTTAGACCTAAGGACAGACCTCAGGATGATGATGCCCGTGGTGGAC 

ProValValArgGluLysGlnLeuGlnGlnGluLeuLeuLeuIleGlnGlnGlnGlnGln 
421 CCTGTTGTCCGTGAGAAGCAATTGCAGCAGGAATTACTTCTTATCCAGCAGCAGCAACAA 

IleGlnLysGlnLeuLeuIleAlaGluPheGlnLysGlnHisGluAsnLeuThrArgGln 
481 ATCCAGAAGCAGCTTCTGATAGCAGAGTTTCAGAAACAGCATGAGAACTTGACACGGCAG 

HisGlnAlaGlnLeuGlnGluHisIleLysLeuGlnGlnGluLeuLeuAlaIleLysGln
541 CACCAGGCTCAGCTTCAGGAGCATATCAAGTTGCAACAGGAACTTCTAGCCATAAAACAG 

GlnGlnGluLeuLeuGluLysGluGlnLysLeuGluGlnGlnArgGlnGluGlnGluVal 
601 CAACAAGAACTCCTAGAAAAGGAGCAGAAACTGGAGCAGCAGAGGCAAGAACAGGAAGTA

GluArgHisArgArgGluGlnGlnLeuProProLeuArgGlyLysAspArgGlyArgGlu 
661 GAGAGGCATCGCAGAGAACAGCAGCTTCCTCCTCTCAGAGGCAAAGATAGAGGACGAGAA 

ArgAlaValAlaSerThrGluValLysGlnLysLeuGlnGluPheLeuLeuSerLysSer 
721 AGGGCAGTGGCAAGTACAGAAGTAAAGCAGAAGCTTCAAGAGTTCCTACTGAGTAAATCA 

AlaThrLysAspThrProThrAsnGlyLysAsnHisSerValSerArgHisProLysLeu 
781 GCAACGAAAGACACTCCAACTAATGGAAAAAATCATTCCGTGAGCCGCCATCCCAAGCTC 

TrpTyrThrAlaAlaHisHisThrSerLeuAspGlnSerSerProProLeuSerGlyThr
841 TGGTACACGGCTGCCCACCACACATCATTGGATCAAAGCTCTCCACCCCTTAGTGGAACA 

SerProSerTyrLysTyrThrLeuProGlyAlaGlnAspAlaLysAspAspPheProLeu 
901 TCTCCATCCTACAAGTACACATTACCAGGAGCACAAGATGCAAAGGATGATTTCCCCCTT

ArgLysThrAlaSerGluProAsnLeuLysValArgSerArgLeuLysGlnLysValAla 
961 CGAAAAACTGCCTCTGAGCCCAACTTGAAGGTGCGGTCCAGGTTAAAACAGAAAGTGGCA 

GluArgArgSerSerProLeuLeuArgArgLysAspGlyAsnValValThrSerPheLys 
1021 GAGAGGAGAAGCAGCCCCTTACTCAGGCGGAAGGATGGAAATGTTGTCACTTCATTCAAG

LysArgMetPheGluValThrGluSerSerValSerSerSerSerProGlySerGlyPro 
1081 AAGCGAATGTTTGAGGTGACAGAATCCTCAGTCAGTAGCAGTTCTCCAGGCTCTGGTCCC 

SerSerProAsnAsnGlyProThrGlySerValThrGluAsnGluThrSerValLeuPro
1141 AGTTCACCAAACAATGGGCCAACTGGAAGTGTTACTGAAAATGAGACTTCGGTTTTGCCC

ProThrProHisAlaGluGlnMetValSerGlnGlnArgIleLeuIleHisGluAspSer
1201 CCTACCCCTCATGCCGAGCAAATGGTTTCACAGCAACGCATTCTAATTCATGAAGATTCC 

MetAsnLeuLeuSerLeuTyrThrSerProSerLeuProAsnIleThrLeuGlyLeuPro
1261 ATGAACCTGCTAAGTCTTTATACCTCTCCTTCTTTGCCCAACATTACCTTGGGGCTTCCC

AlaValProSerGlnLeuAsnAlaSerAsnSerLeuLysGluLysGlnLysCysGluThr
1321 GCAGTGCCATCCCAGCTCAATGCTTCGAATTCACTCAAAGAAAAGCAGAAGTGTGAGACG 

GlnThrLeuArgGlnGlyValProLeuProGlyGlnTyrGlyGlySerIleProAlaSer
1381 CAGACGCTTAGGCAAGGTGTTCCTCTGCCTGGGCAGTATGGAGGCAGCATCCCGGCATCT 

SerSerHisProHisValThrLeuGluGlyLysProProAsnSerSerHisGlnAlaLeu 
1441 TCCAGCCACCCTCATGTTACTTTAGAGGGAAAGCCACCCAACAGCAGCCACCAGGCTCTC 

LeuGlnHisLeuLeuLeuLysGluGlnMetArgGlnGlnLysLeuLeuValAlaGlyGly 
1501 CTGCAGCATTTATTATTGAAAGAACAAATGCGACAGCAAAAGCTTCTTGTAGCTGGTGGA 

ValProLeuHisProGlnSerProLeuAlaThrLysGluArgIleSerProGlyIleArg
1561 GTTCCCTTACATCCTCAGTCTCCCTTGGCAACAAAAGAGAGAATTTCACCTGGCATTAGA

GlyThrHisLysLeuProArgHisArgProLeuAsnArgThrGlnSerAlaProLeuPro 
1621 GGTACCCACAAATTGCCCCGTCACAGACCCCTGAACCGAACCCAGTCTGCACCTTTGCCT 

GlnSerThrLeuAlaGlnLeuValIleGlnGlnGlnHisGlnGlnPheLeuGluLysGln
1681 CAGAGCACGTTGGCTCAGCTGGTCATTCAACAGCAACACCAGCAATTCTTGGAGAAGCAG 

LysGlnTyrGlnGlnGlnIleHisMetAsnLysGluLeuProMetThrPro***
1741 AAGCAATACCAGCAGCAGATCCACATGAACAAAGAATTGCCTATGACCCCTTGATGCTGA 

1801 AACACCAGTGCGTTTGTGGCAATTCCACCACCCACCCTGAGCATGCTGGACGAATACAGA 

1861 GTATCTGGTCACGACTGCAAGAAACTGGGCTGCTAAATAAATGTGAGCGAATTCAAGGTC 

1921 GAAAAGCCAGCCTGGAGGAAATACAGCTTGTTCATTCTGAACATCACTCACTGTTGTATG 

1981 GCACCAACCCCCTGGACGGACAGAAGCTGGACCCCAGGATACTCCTAGGTGATGACTCTC 

2041 AAAAGTTTTTTTCCTCATTACCTTGTGGTGGACTTGGGGTGGACAGTGACACCATTTGGA 

2101 ATGAGCTACACTCGTCCGGTGCTGCACGCATGGCTGTTGGCTGTGTCATCGAGCTGGCTT

2161 CCAAAGTGGCCTCAGGAGAGCTGAAGGTGAGGTCCGGGTTGCATTAAGTGTGGGAAATCC

2221 AGAGAAGAAACTGAAACAGAGATGTTGTTATGTGGGAATTGCGGGGAGTGTGGCGTGGTA 

2281 ATAAAAGGAAGGGCAGAAGGAAGAGGGTAGAGATGGCCACTAAGGTGTGATAATAACTCA 

2341 TCTGTAGGCAGGGAGCAGCTCATCCTGCTCTCAGGGCCTTCTTCTGCCTGAGAACACTCT 

2401 GCAGTCAGGGCCCACCGGTGTGCATGTAAGAGCACAGAGATAATAAGCAAAGCTATGGTT 

2461 CAGGTTAAAAATACCTTTAGTATATACATGTCTGTCATGCCATCCTGAGATTCTCTTTTG 

2521 AGGCAATTTTAAAAATATGATTACTGAGAAGTGTGTATAAGCTCAGAATACCACCCAGAG 

2581 AGAGGGAGGCAGAGAAAGGTAAATACCAGACGGGAAGGATTGGGAGGAGGAAGGAAATTG

2641 TTGATTAGAAGGGTAATGATCCAGAGTGTGTTTTTCCATGAAAGAACTTAAAAAATGAGC 

2701 TATGCTTTATTGTTCTTTTCTTTTTATGGTCTCTTCTTTTCTACATCGTATGAAAAGAAC

2761 AATGTCCAAACCCCAGCGTTTCCCAGTCTAAACAATTTATAAAAGCTAGAGACCTGACAG

2821 ACGTTGACATTTTATTTGGTATTTTAACAGTGCTATTTAAAGGTACGCCATGTGCGTCTT 

2881 GAATGCAGTTACCCCAATAAACTTTGTTGGTGCTAACACGGCCTTTTAATGCACTAGTTC

2941 ACACACTTCATGACGCAATCTGGGTCGTGATTGATTCGGTATTTTTAGCAATTGCGGGGC 

3001 TTAGGGAAATATATTATGACCAATAACATATGCACTGTGAGTTTTGTGAAACCAAGATAA 

3061 AATAATTAGGATTACTTTTCTTTATGTCTAGTGAATTTTTATTCAATTACATGGGACTCT

3121 TCCAGTTGTGATTAAAAATGTGGAGTAGGAATGTGCACTTCACAATGCAACGTTTGTCCA

3181 AGAAGTCTTTACTCTTAACTCTTTAAAGAGTCAGAGCCTACGGAAATATAATTTTGATAG 

3241 GGTGAGCTCTATTTAAAAAGTAGATGTGCCTGTATATATTTGACATAAGTAGTATTAGGA

3301 CATTGCTCATCTCAGGGGATATATGGGGTCATTAATGTGGTGCTTACTCTTCAGTCTTTA

3361 CCTTTGAAAATGAGCAAAAAAAAAAAAAAAA

1 GGGGAAGAGAGGCACAGACACAGATAGGAGAAGGGCACCGGCTGGAGCCACTTGCAGGAC
61 TGAGGGTTTTTGCAACAAAACCCTAGCAGCCTGAAGAACTCTAAGCCAGATGGGGTGGCT 

MetHisSerMetIleSerSerValAspVal
121 GGACGAGAGCAGCTCTTGGCTCAGCAAAGAATGCACAGTATGATCAGCTCAGTGGATGTG 

LysSerGluValProValGlyLeuGluProIleSerProLeuAspLeuArgThrAspLeu 
181 AAGTCAGAAGTTCCTGTGGGCCTGGAGCCCATCTCACCTTTAGACCTAAGGACAGACCTC 

ArgMetMetMetProValValAspProValValArgGluLysGlnLeuGlnGlnGluLeu 
241 AGGATGATGATGCCCGTGGTGGACCCTGTTGTCCGTGAGAAGCAATTGCAGCAGGAATTA 

LeuLeuIleGlnGlnGlnGlnGlnIleGlnLysGlnLeuLeuIleAlaGluPheGlnLys
301 CTTCTTATCCAGCAGCAGCAACAAATCCAGAAGCAGCTTCTGATAGCAGAGTTTCAGAAA 

GlnHisGluAsnLeuThrArgGlnHisGlnAlaGlnLeuGlnGluHisIleLysGluLeu 
361 CAGCATGAGAACTTGACACGGCAGCACCAGGCTCAGCTTCAGGAGCATATCAAGGAACTT 

LeuAlaIleLysGlnGlnGlnGluLeuLeuGluLysGluGlnLysLeuGluGlnGlnArg
421 CTAGCCATAAAACAGCAACAAGAACTCCTAGAAAAGGAGCAGAAACTGGAGCAGCAGAGG

GlnGluGlnGluValGluArgHisArgArgGluGlnGlnLeuProProLeuArgGlyLys 
481 CAAGAACAGGAAGTAGAGAGGCATCGCAGAGAACAGCAGCTTCCTCCTCTCAGAGGCAAA 

AspArgGlyArgGluArgAlaValAlaSerThrGluValLysGlnLysLeuGlnGluPhe 
541 GATAGAGGACGAGAAAGGGCAGTGGCAAGTACAGAAGTAAAGCAGAAGCTTCAAGAGTTC

LeuLeuSerLysSerAlaThrLysAspThrProThrAsnGlyLysAsnHisSerValSer 
601 CTACTGAGTAAATCAGCAACGAAAGACACTCCAACTAATGGAAAAAATCATTCCGTGAGC 

ArgHisProLysLeuTrpTyrThrAlaAlaHisHisThrSerLeuAspGlnSerSerPro 
661 CGCCATCCCAAGCTCTGGTACACGGCTGCCCACCACACATCATTGGATCAAAGCTCTCCA 

ProLeuSerGlyThrSerProSerTyrLysTyrThrLeuProGlyAlaGlnAspAlaLys 
721 CCCCTTAGTGGAACATCTCCATCCTACAAGTACACATTACCAGGAGCACAAGATGCAAAG 

AspAspPheProLeuArgLysThrAlaSerGluProAsnLeuLysValArgSerArgLeu 
781 GATGATTTCCCCCTTCGAAAAACTGCCTCTGAGCCCAACTTGAAGGTGCGGTCCAGGTTA 

LysGlnLysValAlaGluArgArgSerSerProLeuLeuArgArgLysAspGlyAsnVal
841 AAACAGAAAGTGGCAGAGAGGAGAAGCAGCCCCTTACTCAGGCGGAAGGATGGAAATGTT 

ValThrSerPheLysLysArgMetPheGluValThrGluSerSerValSerSerSerSer 
901 GTCACTTCATTCAAGAAGCGAATGTTTGAGGTGACAGAATCCTCAGTCAGTAGCAGTTCT 

ProGlySerGlyProSerSerProAsnAsnGlyProThrGlySerValThrGluAsnGlu 
961 CCAGGCTCTGGTCCCAGTTCACCAAACAATGGGCCAACTGGAAGTGTTACTGAAAATGAG

ThrSerValLeuProProThrProHisAlaGluGlnMetValSerGlnGlnArgIleLeu
1021 ACTTCGGTTTTGCCCCCTACCCCTCATGCCGAGCAAATGGTTTCACAGCAACGCATTCTA 

IleHisGluAspSerMetAsnLeuLeuSerLeuTyrThrSerProSerLeuProAsnIle
1081 ATTCATGAAGATTCCATGAACCTGCTAAGTCTTTATACCTCTCCTTCTTTGCCCAACATT 

ThrLeuGlyLeuProAlaValProSerGlnLeuAsnAlaSerAsnSerLeuLysGluLys 
1141 ACCTTGGGGCTTCCCGCAGTGCCATCCCAGCTCAATGCTTCGAATTCACTCAAAGAAAAG 



FIG. 22A 






GlnLysCysGluThrGlnThrLeuArgGlnGlyValProLeuProGlyGlnTyrGlyGly 
1201 CAGAAGTGTGAGACGCAGACGCTTAGGCAAGGTGTTCCTCTGCCTGGGCAGTATGGAGGC

SerIleProAlaSerSerSerHisProHisValThrLeuGluGlyLysProProAsnSer
1261 AGCATCCCGGCATCTTCCAGCCACCCTCATGTTACTTTAGAGGGAAAGCCACCCAACAGC 

SerHisGlnAlaLeuLeuGlnHisLeuLeuLeuLysGluGlnMetArgGlnGlnLysLeu 
1321 AGCCACCAGGCTCTCCTGCAGCATTTATTATTGAAAGAACAAATGCGACAGCAAAAGCTT 

LeuValAlaGlyGlyValProLeuHisProGlnSerProLeuAlaThrLysGluArgIle
1381 CTTGTAGCTGGTGGAGTTCCCTTACATCCTCAGTCTCCCTTGGCAACAAAAGAGAGAATT

SerProGlyIleArgGlyThrHisLysLeuProArgHisArgProLeuAsnArgThrGln
1441 TCACCTGGCATTAGAGGTACCCACAAATTGCCCCGTCACAGACCCCTGAACCGAACCCAG 

SerAlaProLeuProGlnSerThrLeuAlaGlnLeuValIleGlnGlnGlnHisGlnGln
1501 TCTGCACCTTTGCCTCAGAGCACGTTGGCTCAGCTGGTCATTCAACAGCAACACCAGCAA 

PheLeuGluLysGlnLysGlnTyrGlnGlnGlnIleHisMetAsnLysLeuLeuSerLys
1561 TTCTTGGAGAAGCAGAAGCAATACCAGCAGCAGATCCACATGAACAAACTGCTTTCGAAA

SerIleGluGlnLeuLysGlnProGlySerHisLeuGluGluAlaGluGluGluLeuGln
1621 TCTATTGAACAACTGAAGCAACCAGGCAGTCACCTTGAGGAAGCAGAGGAAGAGCTTCAG 

GlyAspGlnAlaMetGlnGluAspArgAlaProSerSerGlyAsnSerThrArgSerAsp 
1681 GGGGACCAGGCGATGCAGGAAGACAGAGCGCCCTCTAGTGGCAACAGCACTAGGAGCGAC 

SerSerAlaCysValAspAspThrLeuGlyGlnValGlyAlaValLysValLysGluGlu 
1741 AGCAGTGCTTGTGTGGATGACACACTGGGACAAGTTGGGGCTGTGAAGGTCAAGGAGGAA 

ProValAspSerAspGluAspAlaGlnIleGlnGluMetGluSerGlyGluGlnAlaAla
1801 CCAGTGGACAGTGATGAAGATGCTCAGATCCAGGAAATGGAATCTGGGGAGCAGGCTGCT

PheMetGlnGlnProPheLeuGluProThrHisThrArgAlaLeuSerValArgGlnAla 
1861 TTTATGCAACAGCCTTTCCTGGAACCCACGCACACACGTGCGCTCTCTGTGCGCCAAGCT 

ProLeuAlaAlaValGlyMetAspGlyLeuGluLysHisArgLeuValSerArgThrHis
1921 CCGCTGGCTGCGGTTGGCATGGATGGATTAGAGAAACACCGTCTCGTCTCCAGGACTCAC 

SerSerProAlaAlaSerValLeuProHisProAlaMetAspArgProLeuGlnProGly 
1981 TCTTCCCCTGCTGCCTCTGTTTTACCTCACCCAGCAATGGACCGCCCCCTCCAGCCTGGC 

SerAlaThrGlyIleAlaTyrAspProLeuMetLeuLysHisGlnCysValCysGlyAsn
2041 TCTGCAACTGGAATTGCCTATGACCCCTTGATGCTGAAACACCAGTGCGTTTGTGGCAAT 

SerThrThrHisProGluHisAlaGlyArgIleGlnSerIleTrpSerArgLeuGlnGlu
2101 TCCACCACCCACCCTGAGCATGCTGGACGAATACAGAGTATCTGGTCACGACTGCAAGAA 

ThrGlyLeuLeuAsnLysCysGluArgIleGlnGlyArgLysAlaSerLeuGluGluIle
2161 ACTGGGCTGCTAAATAAATGTGAGCGAATTCAAGGTCGAAAAGCCAGCCTGGAGGAAATA 

GlnLeuValHisSerGluHisHisSerLeuLeuTyrGlyThrAsnProLeuAspGlyGln
2221 CAGCTTGTTCATTCTGAACATCACTCACTGTTGTATGGCACCAACCCCCTGGACGGACAG

LysLeuAspProArgIleLeuLeuGlyAspAspSerGlnLysPhePheSerSerLeuPro
2281 AAGCTGGACCCCAGGATACTCCTAGGTGATGACTCTCAAAAGTTTTTTTCCTCATTACCT

CysGlyGlyLeuGlyValAspSerAspThrIleTrpAsnGluLeuHisSerSerGlyAla
2341 TGTGGTGGACTTGGGGTGGACAGTGACACCATTTGGAATGAGCTACACTCGTCCGGTGCT 

AlaArgMetAlaValGlyCysValIleGluLeuAlaSerLysValAlaSerGlyGluLeu
2401 GCACGCATGGCTGTTGGCTGTGTCATCGAGCTGGCTTCCAAAGTGGCCTCAGGAGAGCTG 

LysAsnGlyPheAlaValValArgProProGlyHisHisAlaGluGluSerThrAlaMet 
2461 AAGAATGGGTTTGCTGTTGTGAGGCCCCCTGGCCATCACGCTGAAGAATCCACAGCCATG 

GlyPheCysPhePheAsnSerValAlaIleThrAlaLysTyrLeuArgAspGlnLeuAsn
2521 GGGTTCTGCTTTTTTAATTCAGTTGCAATTACCGCCAAATACTTGAGAGACCAACTAAAT

IleSerLysIleLeuIleValAspLeuAspValHisHisGlyAsnGlyThrGlnGlnAla 
2581 ATAAGCAAGATATTGATTGTAGATCTGGATGTTCACCATGGAAACGGTACCCAGCAGGCC 

PheTyrAlaAspProSerIleLeuTyrIleSerLeuHisArgTyrAspGluGlyAsnPhe
2641 TTTTATGCTGACCCCAGCATCCTGTACATTTCACTCCATCGCTATGATGAAGGGAACTTT



PheProGlySerGlyAlaProAsnGluValGlyThrGlyLeuGlyGluGlyTyrAsnIle
2701 TTCCCTGGCAGTGGAGCCCCAAATGAGGTTGGAACAGGCCTTGGAGAAGGGTACAATATA 

AsnIleAlaTrpThrGlyGlyLeuAspProProMetGlyAspValGluTyrLeuGluAla
2761 AATATTGCCTGGACAGGTGGCCTTGATCCTCCCATGGGAGATGTTGAGTACCTTGAAGCA

PheArgThrIleValLysProValAlaLysGluPheAspProAspMetValLeuValSer
2821 TTCAGGACCATCGTGAAGCCTGTGGCCAAAGAGTTTGATCCAGACATGGTCTTAGTATCT 

AlaGlyPheAspAlaLeuGluGlyHisThrProProLeuGlyGlyTyrLysValThrAla 
2881 GCTGGATTTGATGCATTGGAAGGCCACACCCCTCCTCTAGGAGGGTACAAAGTGACGGCA 

LysCysPheGlyHisLeuThrLysGlnLeuMetThrLeuAlaAspGlyArgValValLeu 
2941 AAATGTTTTGGTCATTTGACGAAGCAATTGATGACATTGGCTGATGGACGTGTGGTGTTG 

AlaLeuGluGlyGlyHisAspLeuThrAlaIleCysAspAlaSerGluAlaCysValAsn
3001 GCTCTAGAAGGAGGACATGATCTCACAGCCATCTGTGATGCATCAGAAGCCTGTGTAAAT 

AlaLeuLeuGlyAsnGluLeuGluProLeuAlaGluAspIleLeuHisGlnSerProAsn 
3061 GCCCTTCTAGGAAATGAGCTGGAGCCACTTGCAGAAGATATTCTCCACCAAAGCCCGAAT 

MetAsnAlaValIleSerLeuGlnLysIleIleGluIleGlnSerMetSerLeuLysPhe
3121 ATGAATGCTGTTATTTCTTTACAGAAGATCATTGAAATTCAAAGTATGTCTTTAAAGTTC

Ser*** 
3181 TCTTAA

1 GGGGAAGAGAGGCACAGACACAGATAGGAGAAGGGCACCGGCTGGAGCCACTTGCAGGAC
61 TGAGGGTTTTTGCAACAAAACCCTAGCAGCCTGAAGAACTCTAAGCCAGATGGGGTGGCT

MetHisSerMetIleSerSerValAspVal
121 GGACGAGAGCAGCTCTTGGCTCAGCAAAGAATGCACAGTATGATCAGCTCAGTGGATGTG 

LysSerGluValProValGlyLeuGluProIleSerProLeuAspLeuArgThrAspLeu 
181 AAGTCAGAAGTTCCTGTGGGCCTGGAGCCCATCTCACCTTTAGACCTAAGGACAGACCTC 

ArgMetMetMetProValValAspProValValArgGluLysGlnLeuGlnGlnGluLeu 
241 AGGATGATGATGCCCGTGGTGGACCCTGTTGTCCGTGAGAAGCAATTGCAGCAGGAATTA

LeuLeuIleGlnGlnGlnGlnGlnIleGlnLysGlnLeuLeuIleAlaGluPheGlnLys
301 CTTCTTATCCAGCAGCAGCAACAAATCCAGAAGCAGCTTCTGATAGCAGAGTTTCAGAAA 

GlnHisGluAsnLeuThrArgGlnHisGlnAlaGlnLeuGlnGluHisIleLysGluLeu 
361 CAGCATGAGAACTTGACACGGCAGCACCAGGCTCAGCTTCAGGAGCATATCAAGGAACTT 

LeuAlaIleLysGlnGlnGlnGluLeuLeuGluLysGluGlnLysLeuGluGlnGlnArg
421 CTAGCCATAAAACAGCAACAAGAACTCCTAGAAAAGGAGCAGAAACTGGAGCAGCAGAGG

GlnGluGlnGluValGluArgHisArgArgGluGlnGlnLeuProProLeuArgGlyLys 
481 CAAGAACAGGAAGTAGAGAGGCATCGCAGAGAACAGCAGCTTCCTCCTCTCAGAGGCAAA 

AspArgGlyArgGluArgAlaValAlaSerThrGluValLysGlnLysLeuGlnGluPhe
541 GATAGAGGACGAGAAAGGGCAGTGGCAAGTACAGAAGTAAAGCAGAAGCTTCAAGAGTTC

LeuLeuSerLysSerAlaThrLysAspThrProThrAsnGlyLysAsnHisSerValSer 
601 CTACTGAGTAAATCAGCAACGAAAGACACTCCAACTAATGGAAAAAATCATTCCGTGAGC 

ArgHisProLysLeuTrpTyrThrAlaAlaHisHisThrSerLeuAspGlnSerSerPro 
661 CGCCATCCCAAGCTCTGGTACACGGCTGCCCACCACACATCATTGGATCAAAGCTCTCCA

ProLeuSerGlyThrSerProSerTyrLysTyrThrLeuProGlyAlaGlnAspAlaLys 
721 CCCCTTAGTGGAACATCTCCATCCTACAAGTACACATTACCAGGAGCACAAGATGCAAAG 

AspAspPheProLeuArgLysThrAlaSerGluProAsnLeuLysValArgSerArgLeu
781 GATGATTTCCCCCTTCGAAAAACTGCCTCTGAGCCCAACTTGAAGGTGCGGTCCAGGTTA

LysGlnLysValAlaGluArgArgSerSerProLeuLeuArgArgLysAspGlyAsnVal 
841 AAACAGAAAGTGGCAGAGAGGAGAAGCAGCCCCTTACTCAGGCGGAAGGATGGAAATGTT 

ValThrSerPheLysLysArgMetPheGluValThrGluSerSerValSerSerSerSer
901 GTCACTTCATTCAAGAAGCGAATGTTTGAGGTGACAGAATCCTCAGTCAGTAGCAGTTCT

ProGlySerGlyProSerSerProAsnAsnGlyProThrGlySerValThrGluAsnGlu 
961 CCAGGCTCTGGTCCCAGTTCACCAAACAATGGGCCAACTGGAAGTGTTACTGAAAATGAG

ThrSerValLeuProProThrProHisAlaGluGlnMetValSerGlnGlnArgIleLeu
1021 ACTTCGGTTTTGCCCCCTACCCCTCATGCCGAGCAAATGGTTTCACAGCAACGCATTCTA

IleHisGluAspSerMetAsnLeuLeuSerLeuTyrThrSerProSerLeuProAsnIle
1081 ATTCATGAAGATTCCATGAACCTGCTAAGTCTTTATACCTCTCCTTCTTTGCCCAACATT 

ThrLeuGlyLeuProAlaValProSerGlnLeuAsnAlaSerAsnSerLeuLysGluLys 
1141 ACCTTGGGGCTTCCCGCAGTGCCATCCCAGCTCAATGCTTCGAATTCACTCAAAGAAAAG

GlnLysCysGluThrGlnThrLeuArgGlnGlyValProLeuProGlyGlnTyrGlyGly
1201 CAGAAGTGTGAGACGCAGACGCTTAGGCAAGGTGTTCCTCTGCCTGGGCAGTATGGAGGC 

SerIleProAlaSerSerSerHisProHisValThrLeuGluGlyLysProProAsnSer
1261 AGCATCCCGGCATCTTCCAGCCACCCTCATGTTACTTTAGAGGGAAAGCCACCCAACAGC 

SerHisGlnAlaLeuLeuGlnHisLeuLeuLeuLysGluGlnMetArgGlnGlnLysLeu 
1321 AGCCACCAGGGTCTCCTGCAGCATTTATTATTGAAAGAACAAATGCGACAGCAAAAGCTT 

LeuValAlaGlyGlyValProLeuHisProGlnSerProLeuAlaThrLysGluArgIle
1381 CTTGTAGCTGGTGGAGTTCCCTTACATCCTCAGTCTCCCTTGGCAACAAAAGAGAGAATT 

SerProGlyIleArgGlyThrHisLysLeuProArgHisArgProLeuAsnArgThrGln
1441 TCACCTGGCATTAGAGGTACCCACAAATTGCCCCGTCACAGACCCCTGAACCGAACCCAG 

SerAlaProLeuProGlnSerThrLeuAlaGlnLeuValIleGlnGlnGlnHisGlnGln
1501 TCTGCACCTTTGCCTCAGAGCACGTTGGCTCAGCTGGTCATTCAACAGCAACACCAGCAA 

PheLeuGluLysGlnLysGlnTyrGlnGlnGlnIleHisMetAsnLysLeuLeuSerLys
1561 TTCTTGGAGAAGCAGAAGCAATACCAGCAGCAGATCCACATGAACAAACTGCTTTCGAAA

SerIleGluGlnLeuLysGlnProGlySerHisLeuGluGluAlaGluGluGluLeuGln
1621 TCTATTGAACAACTGAAGCAACCAGGCAGTCACCTTGAGGAAGCAGAGGAAGAGCTTCAG

GlyAspGlnAlaMetGlnGluAspArgAlaProSerSerGlyAsnSerThrArgSerAsp 
1681 GGGGACCAGGCGATGCAGGAAGACAGAGCGCCCTCTAGTGGCAACAGCACTAGGAGCGAC 

SerSerAlaCysValAspAspThrLeuGlyGlnValGlyAlaValLysValLysGluGlu 
1741 AGCAGTGCTTGTGTGGATGACACACTGGGACAAGTTGGGGCTGTGAAGGTCAAGGAGGAA

ProValAspSerAspGluAspAlaGlnIleGlnGluMetGluSerGlyGluGlnAlaAla
1801 CCAGTGGACAGTGATGAAGATGCTCAGATCCAGGAAATGGAATCTGGGGAGCAGGCTGCT 

PheMetGlnGlnProPheLeuGluProThrHisThrArgAlaLeuSerValArgGlnAla
1861 TTTATGCAACAGCCTTTCCTGGAACCCACGCACACACGTGCGCTCTCTGTGCGCCAAGCT

ProLeuAlaAlaValGlyMetAspGlyLeuGluLysHisArgLeuValSerArgThrHis
1921 CCGCTGGCTGCGGTTGGCATGGATGGATTAGAGAAACACCGTCTCGTCTCCAGGACTCAC 

SerSerProAlaAlaSerValLeuProHisProAlaMetAspArgProLeuGlnProGly 
1981 TCTTCCCCTGCTGCCTCTGTTTTACCTCACCCAGCAATGGACCGCCCCCTCCAGCCTGGC 

SerAlaThrGlyIleAlaTyrAspProLeuMetLeuLysHisGlnCysValCysGlyAsn
2041 TCTGCAACTGGAATTGCCTATGACCCCTTGATGCTGAAACACCAGTGCGTTTGTGGCAAT 

SerThrThrHisProGluHisAlaGlyArgIleGlnSerIleTrpSerArgLeuGlnGlu
2101 TCCACCACCCACCCTGAGCATGCTGGACGAATACAGAGTATCTGGTCACGACTGCAAGAA 

ThrGlyLeuLeuAsnLysCysGluArgIleGlnGlyArgLysAlaSerLeuGluGluIle
2161 ACTGGGCTGCTAAATAAATGTGAGCGAATTCAAGGTCGAAAAGCCAGCCTGGAGGAAATA 

GlnLeuValHisSerGluHisHisSerLeuLeuTyrGlyThrAsnProLeuAspGlyGln
2221 CAGCTTGTTCATTCTGAACATCACTCACTGTTGTATGGCACCAACCCCCTGGACGGACAG

LysLeuAspProArgIleLeuLeuGlyAspAspSerGlnLysPhePheSerSerLeuPro
2281 AAGCTGGACCCCAGGATACTCCTAGGTGATGACTCTCAAAAGTTTTTTTCCTCATTACCT

CysGlyGlyLeuGlyValAspSerAspThrIleTrpAsnGluLeuHisSerSerGlyAla
2341 TGTGGTGGACTTGGGGTGGACAGTGACACCATTTGGAATGAGCTACACTCGTCCGGTGCT 

AlaArgMetAlaValGlyCysValIleGluLeuAlaSerLysValAlaSerGlyGluLeu
2401 GCACGCATGGCTGTTGGCTGTGTCATCGAGCTGGCTTCCAAAGTGGCCTCAGGAGAGCTG

LysAsnGlyPheAlaValValArgProProGlyHisHisAlaGluGluSerThrAlaMet 
2461 AAGAATGGGTTTGCTGTTGTGAGGCCCCCTGGCCATCACGCTGAAGAATCCACAGCCATG 

GlyPheCysPhePheAsnSerValAlaIleThrAlaLysTyrLeuArgAspGlnLeuAsn
2521 GGGTTCTGCTTTTTTAATTCAGTTGCAATTACCGCCAAATACTTGAGAGACCAACTAAAT 

IleSerLysIleLeuIleValAspLeuAspValHisHisGlyAsnGlyThrGlnGlnAla 
2581 ATAAGCAAGATATTGATTGTAGATCTGGATGTTCACCATGGAAACGGTACCCAGCAGGCC 

PheTyrAlaAspProSerIleLeuTyrIleSerLeuHisArgTyrAspGluGlyAsnPhe
2641 TTTTATGCTGACCCCAGCATCCTGTACATTTCACTCCATCGCTATGATGAAGGGAACTTT

PheProGlySerGlyAlaProAsnGluValArgPheIleSerLeuGluProHisPheTyr
2701 TTCCCTGGCAGTGGAGCCCCAAATGAGGTTCGGTTTATTTCTTTAGAGCCCCACTTTTAT 

LeuTyrLeuSerGlyAsnCysIleAla***
2761 TTGTATCTTTCAGGTAATTGCATTGCATGA

1 GGGGAAGAGAGGCACAGACACAGATAGGAGAAGGGCACCGGCTGGAGCCACTTGCAGGAC
61 TGAGGGTTTTTGCAACAAAACCCTAGCAGCCTGAAGAACTCTAAGCCAGATGGGGTGGCT 

MetHisSerMetIleSerSerValAspVal
121 GGACGAGAGCAGCTCTTGGCTCAGCAAAGAATGCACAGTATGATCAGCTCAGTGGATGTG 

LysSerGluValProValGlyLeuGluProIleSerProLeuAspLeuArgThrAspLeu 
181 AAGTCAGAAGTTCCTGTGGGCCTGGAGCCCATCTCACCTTTAGACCTAAGGACAGACCTC 

ArgMetMetMetProValValAspProValValArgGluLysGlnLeuGlnGlnGluLeu 
241 AGGATGATGATGCCCGTGGTGGACCCTGTTGTCCGTGAGAAGCAATTGCAGCAGGAATTA

LeuLeuIleGlnGlnGlnGlnGlnIleGlnLysGlnLeuLeuIleAlaGluPheGlnLys
301 CTTCTTATCCAGCAGCAGCAACAAATCCAGAAGCAGCTTCTGATAGCAGAGTTTCAGAAA

GlnHisGluAsnLeuThrArgGlnHisGlnAlaGlnLeuGlnGluHisIleLysGluLeu
361 CAGCATGAGAACTTGACACGGCAGCACCAGGCTCAGCTTCAGGAGCATATCAAGGAACTT 

LeuAlaIleLysGlnGlnGlnGluLeuLeuGluLysGluGlnLysLeuGluGlnGlnArg
421 CTAGCCATAAAACAGCAACAAGAACTCCTAGAAAAGGAGCAGAAACTGGAGCAGCAGAGG

GlnGluGlnGluValGluArgHisArgArgGluGlnGlnLeuProProLeuArgGlyLys
481 CAAGAACAGGAAGTAGAGAGGCATCGCAGAGAACAGCAGCTTCCTCCTCTCAGAGGCAAA 

AspArgGlyArgGluArgAlaValAlaSerThrGluValLysGlnLysLeuGlnGluPhe 
541 GATAGAGGACGAGAAAGGGCAGTGGCAAGTACAGAAGTAAAGCAGAAGCTTCAAGAGTTC 

LeuLeuSerLysSerAlaThrLysAspThrProThrAsnGlyLysAsnHisSerValSer
601 CTACTGAGTAAATCAGCAACGAAAGACACTCCAACTAATGGAAAAAATCATTCCGTGAGC 

ArgHisProLysLeuTrpTyrThrAlaAlaHisHisThrSerLeuAspGlnSerSerPro 
661 CGCCATCCCAAGCTCTGGTACACGGCTGCCCACCACACATCATTGGATCAAAGCTCTCCA 

ProLeuSerGlyThrSerProSerTyrLysTyrThrLeuProGlyAlaGlnAspAlaLys 
721 CCCCTTAGTGGAACATCTCCATCCTACAAGTACACATTACCAGGAGCACAAGATGCAAAG

AspAspPheProLeuArgLysThrAlaSerGluProAsnLeuLysValArgSerArgLeu 
781 GATGATTTCCCCCTTCGAAAAACTGCCTCTGAGCCCAACTTGAAGGTGCGGTCCAGGTTA 

LysGlnLysValAlaGluArgArgSerSerProLeuLeuArgArgLysAspGlyAsnVal
841 AAACAGAAAGTGGCAGAGAGGAGAAGCAGCCCCTTACTCAGGCGGAAGGATGGAAATGTT 

ValThrSerPheLysLysArgMetPheGluValThrGluSerSerValSerSerSerSer 
901 GTCACTTCATTCAAGAAGCGAATGTTTGAGGTGACAGAATCCTCAGTCAGTAGCAGTTCT 

ProGlySerGlyProSerSerProAsnAsnGlyProThrGlySerValThrGluAsnGlu 
961 CCAGGCTCTGGTCCCAGTTCACCAAACAATGGGCCAACTGGAAGTGTTACTGAAAATGAG 

ThrSerValLeuProProThrProHisAlaGluGlnMetValSerGlnGlnArgIleLeu 
1021 ACTTCGGTTTTGCCCCCTACCCCTCATGCCGAGCAAATGGTTTCACAGCAACGCATTCTA 

IleHisGluAspSerMetAsnLeuLeuSerLeuTyrThrSerProSerLeuProAsnIle 
1081 ATTCATGAAGATTCCATGAACCTGCTAAGTCTTTATACCTCTCCTTCTTTGCCCAACATT 

ThrLeuGlyLeuProAlaValProSerGlnLeuAsnAlaSerAsnSerLeuLysGluLys 
1141 ACCTTGGGGCTTCCCGCAGTGCCATCCCAGCTCAATGCTTCGAATTCACTCAAAGAAAAG 

GlnLysCysGluThrGlnThrLeuArgGlnGlyValProLeuProGlyGlnTyrGlyGly 
1201 CAGAAGTGTGAGACGCAGACGCTTAGGCAAGGTGTTCCTCTGCCTGGGCAGTATGGAGGC 

SerIleProAlaSerSerSerHisProHisValThrLeuGluGlyLysProProAsnSer 
1261 AGCATCCCGGCATCTTCCAGCCACCCTCATGTTACTTTAGAGGGAAAGCCACCCAACAGC 

SerHisGlnAlaLeuLeuGlnHisLeuLeuLeuLysGluGlnMetArgGlnGlnLysLeu 
1321 AGCCACCAGGCTCTCCTGCAGCATTTATTATTGAAAGAACAAATGCGACAGCAAAAGCTT 

LeuValAlaGlyGlyValProLeuHisProGlnSerProLeuAlaThrLysGluArgIle 
1381 CTTGTAGCTGGTGGAGTTCCCTTACATCCTCAGTCTCCCTTGGCAACAAAAGAGAGAATT 

SerProGlyIleArgGlyThrHisLysLeuProArgHisArgProLeuAsnArgThrGln 
1441 TCACCTGGCATTAGAGGTACCCACAAATTGCCCCGTCACAGACCCCTGAACCGAACCCAG 

SerAlaProLeuProGlnSerThrLeuAlaGlnLeuValIleGlnGlnGlnHisGlnGln 
1501 TCTGCACCTTTGCCTCAGAGCACGTTGGCTCAGCTGGTCATTCAACAGCAACACCAGCAA 

PheLeuGluLysGlnLysGlnTyrGlnGlnGlnIleHisMetAsnLysLeuLeuSerLys 
1561 TTCTTGGAGAAGCAGAAGCAATACCAGCAGCAGATCCACATGAACAAACTGCTTTCGAAA 

SerIleGluGlnLeuLysGlnProGlySerHisLeuGluGluAlaGluGluGluLeuGln 
1621 TCTATTGAACAACTGAAGCAACCAGGCAGTCACCTTGAGGAAGCAGAGGAAGAGCTTCAG 

GlyAspGlnAlaMetGlnGluAspArgAlaProSerSerGlyAsnSerThrArgSerAsp 
1681 GGGGACCAGGCGATGCAGGAAGACAGAGCGCCCTCTAGTGGCAACAGCACTAGGAGCGAC 

SerSerAlaCysValAspAspThrLeuGlyGlnValGlyAlaValLysValLysGluGlu 
1741 AGCAGTGCTTGTGTGGATGACACACTGGGACAAGTTGGGGCTGTGAAGGTCAAGGAGGAA 

ProValAspSerAspGluAspAlaGlnIleGlnGluMetGluSerGlyGluGlnAlaAla 
1801 CCAGTGGACAGTGATGAAGATGCTCAGATCCAGGAAATGGAATCTGGGGAGCAGGCTGCT 

PheMetGlnGlnValIleGlyLysAspLeuAlaProGlyPheValIleLysValIleIle 
1861 TTTATGCAACAGGTAATAGGCAAAGATTTAGCTCCAGGATTTGTAATTAAAGTCATTATC 

*** 

1921 TGAACATGAAATGCATTGCAGGTTTGGTAAATGGATATGATTTCCTATCAGTTTATATTT 

1981 CTCTATGATTTGAGTTCAGTGTTTAAGGATTCTACCTAATGCAGATATATGTATATATCT 

2041 ATATAGAGGTCTTTCTATATACTGATCTCTATATAGATATCAATGTTTCATTGAAAATCC 

2101 ACTGGTAAGGAAATACCTGTTATACTAAAATTATGATACATAATATCTGAGCAGTTAATA 

2161 GGCTTTAAATTTATCCCAAAGCCTGCTACACCAATTACTTCTAAAGAAAACAAATTCACT 

2221 GTTATTTTGAGTTTATGTGTTGAGATCAGTGACTGCTGGATAGTCTCCCAGTCTGATCAA 

2281 TGAAGCATTCGATTAGTTTTTGATTTTTTGCAACATCTAGAATTTAATTTTCACATCACT 

2341 GTACATAATGTATCATACTATAGTCTTGAACACTGTTAAAGGTAGTCTGCCCCTTCCTTC 

2401 CTCTCTCTTTTTTTAGTTAAGTAGAAATGTTCTGGTCACCATGCCAGTAGTCCTAGGTTA 
2461 TTGTGTAGGTTGCAATTGAACATATTAGGAATACAGGTGGTTTTAAATATATAGATGCAA 

2521 ATTGCAGCACTACTTTAAATATTAGATTATGTCTCACATAGCACTGCTCATTTTACTTTT 

2581 ATTTTGTGTAATTTGATGACACTGTCTATCAAAAAAGAGCAAATGAAGCAGATGCAAATG 
2641 TTAGTGAGAAGTAATGTGCAGCATTATGGTCCAATCAGATACAATATTGTGTCTACAATT 

2701 GCAAAAAACACAGTAACAGGATGAATATTATCTGATATCAAGTCAAAATCAGTTTGAAAA 

2761 GAAGGTGTATCATATTTTATATTGTCACTAGAATCTCTTAAGTATAATTCCATAATGACA 

2821 TGGGCATATACCGTAACATTCTGGCAAATAACAATTAGAAAAGATAGGTTTAACAAAAAA 

2881 ATTTACTTGTATATAATGCACCTTCAGGAGGACTATGTCCTTTGATGCTATAAAATACAA 

2941 ACAACTTTGAAGGCAACAGAAGACACTGTTTATTCAAGTCAGTTCTTTGTCAGGTTCCTG 

3001 CTGTTCTCCTACAGAAAAGTGATTCTGTGAGGGTGAACAGGAAATGCCTTGTGGAAACAG 

3061 GAAGTCCAAGTGATTCATGTACTGAGGAATGTAGGAAAAAAAATCTGAGGATAGTGCTTT 

3121 ACTCTTTCTGTTTTTAAAGGGCACTCTATGAATTGATTTATTGTCTAAGAAAATAACACC 

3181 ACAAGTAGGGAAATTGTTACGGAAGCTTTTCACTGGAACATTTCCTTCATATTCCCTTTT 

3241 GATATGTTTACCTTGTTTTATAGGTTTACTTTTGTTAAGCTAGTTAAAGGTTCGTTGTAT 

3301 TAAGACCCCTTTAATATGGATAATCCAAATTGACCTAGAATCTTTGTGAGGTTTTTTCTA 

3361 TTAAAATATTTATATTTCTAAATCCGAGGTATTTCAAGGTGTAGTATCCTATTTCAAAGG 

3421 AGATATAGCAGTTTTGCCAAATGTAGACATTGTTCAACTGTATGTTATTGGCACGTGTTG 

3481 TTTACATTTTGCTGTGACATTTAAAAATATTTCTTTAAAAATGTTACTGCTAAAGATACA 

3541 TTATCCTTTTTTAAAAAGTCTCCATTCAAATTAAATTAACATAACTAGAAGTTAGAAAGT 

3601 TTAAAAGTTTTCCACATAATGAAAGTCCTTCTGATAATTTGACAAATAGCTATAATAGGA 

3661 ACACTCCCTATCACCAACATATTTTGGTTAGTATATTCCTTCATATTAAAATGACTTTTT 

3721 GTCAGTTGTTTTGCATTAAAAATATGGCATGCCTAAGATAAAATTGTATATTTTTTCCAT 

3781 CTCATAAATATTCATTTTCTTCAAAGTCTTTTTTCAATCTCATAAAAAAGGGATAGTGCA 

3841 TCTTTTAAAATACATTTTATTTGGGGAGGAACATGTGGCTGAGCAGACTTTTGTATAATA 

3901 TTACTTCAAAGATATGTAATCACAAACAAAAAAAACTATTTTTTATAATGTCATTTGAGA 

3961 GAGTTTCATCAGTACAGTTGGTGGACGTTAATTGTTTGAATTTGATAGTCTTTGAATTTA 

4021 ATCAAGAAACTACCTGGAACCAGTGAAAAGGAAAGCTGGACTTAAATAATCTTAGAATTA 

4081 ATTGATAAATGTCTCTTTTAAAATCTACTGTATTTATTATAATTTACACCCTTGAAGGTG 

4141 ATCTCTTGTTTTGTGTTGTAAATATATTGTTTGTATGTTTCCCTTCTTGCCTTCTGTTAT 

4201 AAGTCTCTTCCTTTCTCAAATAAAGTTTTTTTTAAAAG 

[Figure: multiple nucleotide sequence alignment, shown in 50-nucleotide blocks from position 51 to position 3500. Annotations recoverable from the figure: SPLICE JUNCTION: CAG>>>ATG (at the end of block 251-300); SPLICE ACCEPTOR 1 and SPLICE ACCEPTOR 2 (at block 551-600); SPLICE JUNCTION: CAAA>>GAAA OR CTGC (at the end of block 1751-1800); SPLICE JUNCTION: CAG>>>CCT OR GTA (at the end of block 2001-2050).]

HDAC9 POLYPEPTIDES AND POLYNUCLEOTIDES AND USES THEREOF 

RELATED APPLICATIONS 

This application claims the benefit of U.S. Provisional Application No. 
60/298,173, filed on June 14, 2001, U.S. Provisional Application No. 60/311,686, 
filed on August 10, 2001, and U.S. Provisional Application No. 60/316,995, filed on 
September 4, 2001. The entire teachings of the above applications are incorporated 
herein by reference. 

GOVERNMENT SUPPORT 

The invention was supported, in whole or in part, by grant CA-0974823 from 
the National Cancer Institute. The Government has certain rights in the invention. 

BACKGROUND OF THE INVENTION 

The N-terminal tails of core histones are covalently modified by 
post-translational modifications, including acetylation and phosphorylation. Evidence 
suggests that these covalent modifications play important roles in several biological 
activities involving chromatin, e.g., transcription and replication. Histone 
deacetylases (HDACs) catalyze the removal of the acetyl group from the lysine 
residues in the N-terminal tails of nucleosomal core histones, resulting in a more 
compact chromatin structure, a configuration that is generally associated with 
repression of transcription. 

Five proteins and/or open reading frames in yeast (RPD3, HDA1, HOS1, 
HOS2 and HOS3) that share significant homology in the catalytic domain have been 
identified as HDACs based upon their sequence homology to human HDAC1. To 
date, eight HDACs have been identified in mammalian cells, and classified into two 
classes based on their structure and similarity to yeast RPD3 or HDA1 proteins. 
Recently, Sir2 family proteins that are structurally unrelated to the five 
aforementioned proteins have been identified as NAD-dependent HDACs. Class I HDACs 
are the yeast RPD3 homologs HDAC1, 2, 3, and 8, and are composed primarily of a 
catalytic domain. Class II HDACs are the yeast HDA1 homologs HDAC4, 5, 6, and 
7. HDAC4, 5, and 7 contain a long non-catalytic N-terminal end and a C-terminal 
HDAC catalytic domain, while HDAC6 has two HDAC catalytic domains. 
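
The classification described above can be summarized as a small lookup table. The following Python sketch is illustrative only: the names `HDAC_CLASSES` and `hdac_class` are hypothetical, and the membership lists are taken directly from the preceding paragraph. The NAD-dependent Sir2 family is left out because the text describes it as structurally unrelated.

```python
# Class membership as stated in the paragraph above (mammalian HDAC1-8).
# The dictionary layout and names are assumptions made for illustration.
HDAC_CLASSES = {
    "I": {"yeast_homolog": "RPD3", "members": [1, 2, 3, 8]},
    "II": {"yeast_homolog": "HDA1", "members": [4, 5, 6, 7]},
}


def hdac_class(number):
    """Return the class label ("I" or "II") for a mammalian HDAC number."""
    for label, info in HDAC_CLASSES.items():
        if number in info["members"]:
            return label
    raise KeyError(f"HDAC{number} is not classified in the text above")
```

For example, `hdac_class(6)` returns `"II"`, consistent with HDAC6 being described as an HDA1 homolog.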

It has also been determined that histone deacetylases can be sensitive to 
small molecules, including trichostatin A (TSA), trapoxin, and butyrate. For 
example, the yeast RPD3 and HDA1 and mammalian HDAC1, 2, 3, 4, 5, 6, 7 and 8 
are sensitive to inhibition by TSA. The Sir2 family HDACs, yeast 
HOS3 and Drosophila melanogaster dHDAC6, however, appear to be relatively 
insensitive to TSA. A class of hybrid bipolar compounds, such as suberoylanilide 
hydroxamic acid (SAHA), has also been shown to inhibit histone deacetylases and 
induce terminal differentiation and/or apoptosis in various transformed cells. 
Examples of such compounds can be found in U.S. Patent Nos. 5,369,108, issued on 
November 29, 1994, 5,700,811, issued on December 23, 1997, and 5,773,474, issued 
on June 30, 1998, to Breslow et al., as well as U.S. Patent Nos. 5,055,608, issued on 
October 8, 1991, and 5,175,191, issued on December 29, 1992, to Marks et al., the 
entire content of all of which are hereby incorporated by reference. 

The identification of the mechanisms by which histones are deacetylated, and 
the characterization of histone deacetylase function would be of great benefit in 
understanding how gene transcription is controlled, how the cell cycle is regulated, 
and how cells are signaled to undergo terminal differentiation and/or apoptosis. 
Elucidation of such mechanisms can lead to improved therapeutics for many 
diseases, in particular those characterized by cell proliferation or a lack of cell 
differentiation or apoptosis, for example, cancer. 

SUMMARY OF THE INVENTION 
The present invention relates to isolated or recombinant histone deacetylase 
polypeptides, and isolated histone deacetylase nucleic acid molecules encoding those 
polypeptides, as well as vectors and cells containing those isolated nucleic acid 
molecules. 

In one aspect of the invention, the isolated or recombinant histone 
deacetylase polypeptide is selected from a) an isolated or recombinant polypeptide 
comprising SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or SEQ 
ID NO: 10; and b) a polypeptide having at least 60% sequence identity with any one 
of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or SEQ ID NO: 
10. In one embodiment, the isolated or recombinant histone deacetylase polypeptide 
consists of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or SEQ 
ID NO: 10. In another embodiment, the isolated or recombinant histone deacetylase 
polypeptide is mammalian; preferably, the isolated or recombinant histone 
deacetylase polypeptide is human. 
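
As a rough illustration of what an "at least 60% sequence identity" criterion involves, the following Python sketch computes percent identity over two sequences that have already been aligned. It is a sketch under stated assumptions, not the method used in this application: this passage does not say how identity is calculated, so the definition used here (identical positions divided by aligned columns, ignoring columns where both sequences have a gap) is an assumption, and the function name and the variant sequence are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned, equal-length sequences.

    Assumed definition: identical residues / compared columns * 100,
    where columns that are gaps in both sequences are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1  # x cannot be a gap here, since y == x and not both are gaps
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


# First ten residues of the translated sequence shown in the figure above
# (Met-His-Ser-Met-Ile-Ser-Ser-Val-Asp-Val), against a hypothetical variant.
reference = "MHSMISSVDV"
variant = "MHSLISSVDV"  # hypothetical single substitution
identity = percent_identity(reference, variant)
meets_threshold = identity >= 60.0
```

In practice the alignment itself would come from a dedicated alignment program; the sketch covers only the final counting step.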

In another aspect, the invention features an isolated nucleic acid molecule 
selected from a) an isolated nucleic acid comprising SEQ ID NO: 1, SEQ ID NO: 3, 
SEQ ID NO: 5, SEQ ID NO: 7, or SEQ ID NO: 9; b) a complement of an isolated 
nucleic acid comprising SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID 
NO: 7, or SEQ ID NO: 9; c) an isolated nucleic acid encoding a histone deacetylase 
polypeptide of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or 
SEQ ID NO: 10; d) a complement of an isolated nucleic acid encoding a histone 
deacetylase polypeptide of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID 
NO: 8, or SEQ ID NO: 10; e) a nucleic acid that is hybridizeable under high 
stringency conditions to a nucleic acid molecule that encodes any of SEQ ID NO: 2, 
SEQ ID NO: 4, SEQ ID NO: 6, or SEQ ID NO: 8, or a complement thereof; or f) a 
nucleic acid molecule that is hybridizeable under high stringency conditions to a 
nucleic acid comprising SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, or SEQ ID 
NO: 7; and g) an isolated nucleic acid molecule that has at least 55% sequence 
identity with any one of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID 
NO: 7, SEQ ID NO: 9, or a complement thereof. In one embodiment, the isolated 
nucleic acid molecule consists of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, 
SEQ ID NO: 7, or SEQ ID NO: 9. In another embodiment, the isolated nucleic acid 
molecule is mammalian; preferably, the isolated nucleic acid molecule is human. 

In other aspects, the invention features a vector comprising the isolated 
histone deacetylase nucleic acid molecule described above, a cell comprising the 
vector, and a cell comprising the isolated histone deacetylase nucleic acid molecule 
described above. 
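
Several of the nucleic acid molecules above are defined as the complement of a listed sequence. As a minimal sketch, assuming the common convention that the complementary strand is written 5' to 3' (that is, as the reverse complement) and that only unambiguous DNA bases are present, the complement of a sequence can be derived as follows. The function name is hypothetical.

```python
# Watson-Crick pairing for unambiguous DNA bases; case is preserved.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# First two codons of the open reading frame in the figure above (ATG CAC).
strand = "ATGCAC"
complement_strand = reverse_complement(strand)
```

Applying the function twice returns the original sequence, which is a convenient sanity check.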

In another aspect, the invention features a purified antibody that selectively
binds a histone deacetylase polypeptide described above.

In yet another aspect, the invention features a method of identifying a
compound that modulates expression of a histone deacetylase nucleic acid molecule
described above. The method comprises the steps of a) contacting the nucleic acid
molecule with a candidate compound under conditions suitable for expression; and
b) assessing the level of expression of the nucleic acid molecule. A candidate
compound that increases or decreases expression of the nucleic acid molecule
relative to a control is a compound that modulates expression of the nucleic acid
molecule. In one embodiment, the method is carried out in a cell or animal. In
another embodiment, the method is carried out in a cell-free system.
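As an illustration only, the comparison to a control in step b) might be scored as in the sketch below. The function name, the labels, and the fold-change cutoff of 1.5 are assumptions of this sketch; the text above specifies no numeric threshold.

```python
def classify_candidate(treated_level, control_level, min_fold_change=1.5):
    """Label a candidate compound by its effect on expression relative to a control.

    min_fold_change is a hypothetical cutoff chosen for illustration; the
    source only requires an increase or decrease relative to the control.
    """
    if treated_level <= 0 or control_level <= 0:
        raise ValueError("expression levels must be positive")
    fold = treated_level / control_level
    if fold >= min_fold_change:
        return "increases expression"
    if fold <= 1.0 / min_fold_change:
        return "decreases expression"
    return "no change"
```

A compound scored as either "increases expression" or "decreases expression" would count as a modulator in the sense used above.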

The invention also features a method of treating a cell proliferation disease,
an apoptotic disease, or a cell differentiation disease, for example, cancers such as
lymphoma, leukemia, melanoma, ovarian cancer, breast cancer, pancreatic cancer,
prostate cancer, colon cancer, and lung cancer, and myeloproliferative disorders,
including polycythemia vera, essential thrombocythemia, agnogenic myeloid
metaplasia, and chronic myelogenous leukemia, in an individual, comprising
administering a compound identified by the above method.

In still another aspect, the invention features a method of identifying a
compound that modulates the enzymatic activity of the histone deacetylase
polypeptide described above. The method comprises the steps of a) contacting the
polypeptide with a candidate compound under conditions suitable for enzymatic
reaction; and b) assessing the activity level of the polypeptide. A candidate
compound that increases or decreases the activity level of the polypeptide relative to
a control is a compound that modulates the enzymatic activity of the polypeptide. In
one embodiment, the method is carried out in a cell or animal. In another
embodiment, the method is carried out in a cell-free system.

In yet another embodiment, the polypeptide is further contacted with a
substrate for the polypeptide, wherein the substrate is selected from the group
consisting of a cell proliferation disease binding agent, an apoptotic disease binding
agent, and a cell differentiation disease binding agent. In one embodiment, the
candidate compound is an inhibitor. In another embodiment, the candidate compound
is an activator.

In another aspect, the invention features a method of identifying a compound
that modulates the transcriptional repression activity of the histone deacetylase
polypeptide described above. The method comprises the steps of a) contacting the
polypeptide with a candidate compound under conditions suitable for a
transcriptional repression reaction; and b) assessing the transcriptional repression
activity level of the polypeptide. A candidate compound that increases or decreases
the transcriptional repression activity level of the polypeptide relative to a control is
a compound that modulates the transcriptional repression activity of the polypeptide.
In one embodiment, the method is carried out in a cell or animal. In another
embodiment, the method is carried out in a cell-free system.

In yet another embodiment, the polypeptide is further contacted with a
substrate for the polypeptide, wherein the substrate is selected from the group
consisting of a cell proliferation disease binding agent, an apoptotic disease binding
agent, and a cell differentiation disease binding agent. In one embodiment, the
candidate compound is an inhibitor. In another embodiment, the candidate compound
is an activator.

In another aspect, the invention features a method of identifying a compound
that modulates expression of a histone deacetylase nucleic acid molecule described
above. The method comprises the steps of a) providing a nucleic acid molecule
comprising a promoter region of the histone deacetylase nucleic acid molecule
described above, or part of such a promoter region, operably linked to a reporter
gene; b) contacting the nucleic acid molecule with a candidate compound; and c)
assessing the level of the reporter gene. A candidate compound that increases or
decreases expression of the reporter gene relative to a control is a compound that
modulates expression of the histone deacetylase nucleic acid molecule described
above. In one embodiment, the method is carried out in a cell.

In still another aspect, the invention features a method of identifying a
polypeptide that interacts with a histone deacetylase polypeptide described above in
a yeast two-hybrid system. The method comprises the steps of a) providing a first
nucleic acid vector comprising a nucleic acid molecule encoding a DNA binding
domain and the histone deacetylase polypeptide described above; b) providing a
second nucleic acid vector comprising a nucleic acid encoding a transcription
activation domain and a nucleic acid encoding a test polypeptide; c) contacting the
first nucleic acid vector with the second nucleic acid vector in a yeast two-hybrid
system; and d) assessing transcriptional activation in the yeast two-hybrid system.
An increase in transcriptional activation relative to a control indicates that the test
polypeptide is a polypeptide that interacts with the histone deacetylase polypeptide
described above.

The invention also features a pharmaceutical composition comprising a 
histone deacetylase polypeptide described above. 

In addition, the present invention features a method of diagnosing a cell
proliferation disease, an apoptotic disease, or a cell differentiation disease in a
subject. The method comprises the steps of a) obtaining a sample from the subject;
and b) assessing the level of activity or expression of the histone deacetylase
polypeptide described above or the level of the nucleic acid molecule described
above in the sample. If the level is increased relative to a control, then the subject
has an increased likelihood of having a cell proliferation disease, an apoptotic
disease, or a cell differentiation disease, and if the level is decreased relative to a
control, then the subject has a decreased likelihood of having a cell proliferation
disease, an apoptotic disease, or a cell differentiation disease. In one embodiment,
the polypeptide level is assayed using immunohistochemistry techniques. In another
embodiment, the nucleic acid molecule level is assayed using in situ hybridization
techniques.

Compounds and/or polypeptides identified in the above-described screening 
methods are also part of the present invention. 

DESCRIPTION OF THE FIGURES

FIG. 1 is a schematic representation of the order in which FIGS. 1A-1O
should be viewed.

FIGS. 1A-1C show the cDNA sequence of HDAC9 (SEQ ID NO: 1). The
arrows and numbers in the HDAC9 sequence indicate exons. The boxed portion of
the sequence indicates the HDAC domain.

FIGS. 1D-1G show the cDNA sequence of HDAC9a (SEQ ID NO: 3). The
arrows and numbers in the HDAC9a sequence indicate exons. The boxed portion of
the sequence indicates the HDAC domain.

FIGS. 1H-1I show the cDNA sequence of HDRP(ΔNLS) (SEQ ID NO: 9).

FIGS. 1J-1L show the cDNA sequence of HDAC9(ΔNLS) (SEQ ID NO: 5).

FIGS. 1M-1O show the cDNA sequence of HDAC9a(ΔNLS) (SEQ ID NO: 7).

FIG. 2 is a schematic representation of the order in which FIGS. 2A-2E 
should be viewed. 

FIG. 2A shows the amino acid sequence of HDAC9 (SEQ ID NO: 2).

FIG. 2B shows the amino acid sequence of HDAC9a (SEQ ID NO: 4).

FIG. 2C shows the amino acid sequence of HDAC9(ΔNLS) (SEQ ID NO: 6).

FIG. 2D shows the amino acid sequence of HDAC9a(ΔNLS) (SEQ ID NO: 8).

FIG. 2E shows the amino acid sequence of HDRP(ΔNLS) (SEQ ID NO: 10).

FIG. 3 is a schematic representation of the order in which FIGS. 3A-3C 
should be viewed. 

FIGS. 3A-3C show an amino acid sequence alignment of HDRP (SEQ ID
NO: 11), HDAC9 (SEQ ID NO: 2), HDAC9a (SEQ ID NO: 4), and HDAC4 (SEQ
ID NO: 12) polypeptides. Amino acid sequences of HDAC9 (GenBank Accession:
AY032737; SEQ ID NO: 2) and HDAC9a (GenBank Accession: AY032738; SEQ
ID NO: 4) are aligned with HDRP (GenBank Accession: BAA34464; SEQ ID NO:
11) and HDAC4 (GenBank Accession: NP_006028; SEQ ID NO: 12). The identical
residues in all proteins are boxed with solid lines. The similar residues are boxed
with dotted lines.

FIG. 4 shows a schematic representation of the human HDAC9 gene
structure. The striped boxes represent exons present in isoforms HDRP, HDAC9a,
and HDAC9. The lines represent introns. Broken lines are used for larger introns
(with size in base pairs on top). The 5' untranslated region cDNA and coding region
cDNA are represented here. Exons 1-12 encode a non-catalytic domain of the
polypeptides, and exons 14-21 encode the histone deacetylase catalytic domain of
the polypeptides, which provides the polypeptides with deacetylase activity.

FIG. 5 is a schematic representation of the order in which FIGS. 5A-5D 
should be viewed. 

FIGS. 5A-5D show the nucleic acid sequence of HDAC9, containing all
exons expressed in the various isoforms of HDAC9, HDAC9a, HDAC9(ΔNLS),
HDAC9a(ΔNLS), and HDRP(ΔNLS) of the present invention (SEQ ID NO: 13).

FIG. 6A is a scanned image of a multiple human tissue Northern blot that
was probed to determine mRNA expression of HDAC9 using a cDNA probe that
recognizes both HDAC9 and HDAC9a. The tissues examined are lane 1, heart; lane
2, brain; lane 3, placenta; lane 4, lung; lane 5, liver; lane 6, skeletal muscle; lane 7,
kidney; and lane 8, pancreas. Positions of the RNA size marker in kilobases (kb) are
indicated to the left of the blot.

FIG. 6B is a scanned image of an electrophoretic gel showing the results of
RT-PCR analyses of mRNA from the same tissues as examined in the Northern blot
of FIG. 6A to determine the distribution of HDAC9 and HDAC9a mRNA among
these tissues. PCR products were resolved by agarose gel electrophoresis and
visualized by ethidium bromide under UV light. A 1-kb DNA ladder was run on
both sides of the gel with the size (in kb) indicated on the left. On the right side, the
expected products for HDAC9 and HDAC9a are indicated as 9 and 9a, respectively.

FIG. 7 is a graph of HDAC enzymatic activity of HDAC
anti-FLAG-immunoprecipitated proteins isolated from vector control, HDAC9-FLAG, and
HDAC9a-FLAG transfected 293T cells, as measured in fluorescence units using
FLUOR DE LYS™ as a substrate in the presence or absence of 1 µM TSA. Results
are shown as the mean of three independent assays. The inset is a scanned image of
an anti-FLAG Western blot showing the amount of proteins used in the assay. V,
Vector control; 9, HDAC9-FLAG; and 9a, HDAC9a-FLAG.

FIG. 8 is a graph of HDAC enzymatic activity of HDAC
anti-FLAG-immunoprecipitated proteins isolated from vector control and HDAC9a-FLAG
(treated with 2 µM SAHA or left untreated) transfected 293T cells, as measured by
³H-acetic acid released from ³H-histones in the presence or absence of 2 µM SAHA.
Vector control; HDAC9a, HDAC9a-FLAG; and HDAC9a+, HDAC9a-FLAG +
SAHA.

FIG. 9A shows a scanned image of a Western blot of 293T whole cell lysate
and anti-FLAG immunoprecipitates from 293T cells transfected with vector,
HDAC9-FLAG or HDAC9a-FLAG using antibodies against MEF2 and FLAG. Top
panel, anti-MEF2 Western; bottom panel, anti-FLAG Western. L, 293T whole cell
lysate; V, vector control IP; 9, HDAC9-FLAG IP; 9a, HDAC9a-FLAG IP.

FIG. 9B is a graph showing the transcription level of p3XMEF2-luc in the
presence or absence of pcDNA3 empty vector (-), pCMV-MEF2C, and/or a vector
encoding pFLAG-HDAC9 or pFLAG-HDAC9a. p3XMEF2-luc (100 ng) and pRL-TK
(5 ng) were transfected into 293T cells with pcDNA3 empty vector (-) or with
pCMV-MEF2C (100 ng) (+) along with the indicated amount of pFLAG-HDAC9 or
pFLAG-HDAC9a. pFLAG empty vector was used to adjust the DNA to an equal
amount in each transfection. The firefly luciferase activity was first normalized to
the co-transfected Renilla luciferase activity and the value for MEF2C alone was
then set as 1. Results are shown as the mean of three independent transfections +/-
standard deviation.
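The two-step normalization described for FIG. 9B can be sketched as follows. This is a minimal illustration, not the procedure actually used to produce the figure: the helper names are invented for the sketch, and the readings are hypothetical values, not data from the source.

```python
from statistics import mean, stdev


def firefly_over_renilla(firefly, renilla):
    """Step 1: normalize each firefly reading to its co-transfected Renilla reading."""
    return [f / r for f, r in zip(firefly, renilla)]


def relative_to_reference(sample_ratios, reference_ratios):
    """Step 2: scale so that the mean of the reference condition (MEF2C alone) is 1.

    Returns (mean, sample standard deviation) across the replicate transfections.
    """
    reference_mean = mean(reference_ratios)
    scaled = [ratio / reference_mean for ratio in sample_ratios]
    return mean(scaled), stdev(scaled)


# Hypothetical triplicate readings in arbitrary light units (illustrative only).
mef2c_alone = firefly_over_renilla([1000, 1100, 900], [100, 100, 100])
mef2c_plus_hdac9 = firefly_over_renilla([500, 400, 600], [100, 100, 100])

reference_mean, reference_sd = relative_to_reference(mef2c_alone, mef2c_alone)
sample_mean, sample_sd = relative_to_reference(mef2c_plus_hdac9, mef2c_alone)
```

With these made-up readings the reference condition scales to a mean of 1 by construction, and a value below 1 for the co-transfected sample would indicate repression of the reporter.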

FIG. 10 shows a schematic representation of the HDAC domains of human
non-Sir2 family HDACs and HDRP. The boxes represent histone deacetylase
(HDAC) domains.

FIG. 11 is a schematic representation of the order in which FIGS. 11A-11F
should be viewed.

FIGS. 11A-11F show the nucleotide sequence of the vector pFLAG-CMV-5b-HDAC9
(VR1) (SEQ ID NO: 14). Lowercase letters are vector backbone,
uppercase letters are HDAC9 sequence. "Acc" was added at the beginning of the
HDAC9 sequence for translation initiation.

FIG. 12 is a schematic representation of the order in which FIGS. 12-1 
through 12-66 should be viewed. 

FIGS. 12-1 through 12-66 show the nucleotide sequence of the vector
pFLAG-CMV-5b-HDAC9 (VR1), with restriction enzyme sites indicated (SEQ ID
NO: 14).

FIG. 13 is a schematic representation of the order in which FIGS. 13A-13E
should be viewed.

FIGS. 13A-13E show the nucleotide sequence of the vector pFLAG-CMV-5b-HDAC9a
(VR2) (SEQ ID NO: 15). Lowercase letters are vector backbone,
uppercase letters are HDAC9a sequence. "Acc" was added at the beginning of the
HDAC9a sequence for translation initiation.

FIG. 14 is a schematic representation of the order in which FIGS. 14-1 
through 14-61 should be viewed. 

FIGS. 14-1 through 14-61 show the nucleotide sequence of the vector
pFLAG-CMV-5b-HDAC9a (VR2), with restriction enzyme sites indicated (SEQ ID
NO: 15).

DETAILED DESCRIPTION OF THE INVENTION 

A protein designated HDRP (See Zhou et al., Proc. Natl. Acad. Sci. USA,
97:1056-1061 (2000)) (also called MITR (See Sparrow et al., EMBO J.
18:5085-5098 (1999); Zhang et al., J. Biol. Chem., 276:35-39 (2001); and Zhang et al., Proc.
Natl. Acad. Sci. USA, 98:7354-7359 (2001))) that is 50% identical to the N-terminal
domains of histone deacetylase 4 (HDAC4) and histone deacetylase 5 (HDAC5) was
recently identified. The cloning and characterization of a novel histone deacetylase,
HDAC9, of which HDRP is an alternatively spliced isoform, is described herein. The
cDNA sequence of HDAC9 is shown in FIGS. 1A-1C (SEQ ID NO: 1), and the
HDAC9 amino acid sequence is shown in FIG. 2A (SEQ ID NO: 2). In addition to
cloning HDAC9, other alternatively spliced isoforms of HDAC9, designated as
HDAC9a (a polypeptide that is 132 amino acids shorter at the C-terminal end than
HDAC9), and isoforms of HDAC9, HDAC9a, and HDRP polypeptides that lack the
nuclear localization signal (NLS) in the N-terminal non-catalytic end of HDAC9,
termed HDAC9(ΔNLS), HDAC9a(ΔNLS), and HDRP(ΔNLS), respectively, were
also identified. The cDNA sequence of HDAC9a is shown in FIGS. 1D-1G (SEQ
ID NO: 3), and the HDAC9a amino acid sequence is shown in FIG. 2B (SEQ ID
NO: 4). The cDNA sequence of HDAC9 encoding a polypeptide lacking an NLS
(HDAC9(ΔNLS)) is shown in FIGS. 1J-1L (SEQ ID NO: 5), and the HDAC9 lacking
an NLS amino acid sequence is shown in FIG. 2C (SEQ ID NO: 6). The cDNA
sequence of HDAC9a encoding a polypeptide lacking an NLS (HDAC9a(ΔNLS)) is
shown in FIGS. 1M-1O (SEQ ID NO: 7), and the HDAC9a lacking an NLS amino
acid sequence is shown in FIG. 2D (SEQ ID NO: 8). The cDNA sequence of HDRP
encoding a polypeptide lacking an NLS (HDRP(ΔNLS)) is shown in FIGS. 1H-1I
(SEQ ID NO: 9), and the HDRP lacking an NLS amino acid sequence is shown in
FIG. 2E (SEQ ID NO: 10).

POLYPEPTIDES OF THE INVENTION 

The present invention features isolated or recombinant HDAC9 polypeptides,
HDAC9a polypeptides, HDAC9(ΔNLS) polypeptides, HDAC9a(ΔNLS)
polypeptides, and HDRP(ΔNLS) polypeptides, and fragments, derivatives, and
variants thereof, as well as polypeptides encoded by nucleotide sequences described
herein (e.g., other variants). As used herein, the term "polypeptide" refers to a
polymer of amino acids, and not to a specific length; thus, peptides, oligopeptides,
and proteins are included within the definition of a polypeptide.

As used herein, a polypeptide is said to be "isolated," "substantially pure," or
"substantially pure and isolated" when it is substantially free of cellular material,
when it is isolated from recombinant or non-recombinant cells, or free of chemical
precursors or other chemicals when it is chemically synthesized. Typically, the
HDAC9, HDAC9a, HDAC9(ΔNLS), HDAC9a(ΔNLS), or HDRP(ΔNLS)
polypeptide is isolated, substantially pure, or substantially pure and isolated when it
has a relative increased concentration or activity of HDAC9, HDAC9a,
HDAC9(ΔNLS), HDAC9a(ΔNLS), or HDRP(ΔNLS), in comparison to total HDAC
concentration or activity. Preferably the increased activity or concentration of the
HDAC9, HDAC9a, HDAC9(ΔNLS), HDAC9a(ΔNLS), or HDRP(ΔNLS) is at least
2-fold, more preferably, at least 5-fold, and most preferably, at least 10-fold, in
comparison to total HDAC concentration or activity. In addition, a polypeptide can
be joined to another polypeptide with which it is not normally associated in a cell
(e.g., in a "fusion protein") and still be "isolated," "substantially pure," or
"substantially pure and isolated." An isolated, substantially pure, or substantially
pure and isolated polypeptide may be obtained, for example, using affinity
purification techniques described herein, as well as other techniques described herein
and known to those skilled in the art.

By a "histone deacetylase polypeptide" is meant a polypeptide having histone
deacetylase activity, transcription repression activity, and/or the ability to deacetylate
other substrates, for example, transcription factors, including p53, CoRest, E2F,
GATA-1, TFIIE, and TFIIF, that normally have a nuclear or cytoplasmic location in a
cell. A histone deacetylase polypeptide is also a polypeptide whose activity can be
inhibited by molecules having HDAC inhibitory activity. These molecules fall into
four general classes: 1) short-chain fatty acids (e.g., 4-phenylbutyrate and valproic
acid); 2) hydroxamic acids (e.g., SAHA, Pyroxamide, trichostatin A (TSA),
oxamflatin and CHAPs, such as CHAP1 and CHAP31); 3) cyclic tetrapeptides
(Trapoxin A, Apicidin and Depsipeptide (FK-228, also known as FR901228)); 4)
benzamides (e.g., MS-275); and other compounds such as Scriptaid. Examples of
such compounds can be found in U.S. Patent Nos. 5,369,108, issued on November
29, 1994, 5,700,811, issued on December 23, 1997, and 5,773,474, issued on June
30, 1998 to Breslow et al., U.S. Patent Nos. 5,055,608, issued on October 8, 1991,
and 5,175,191, issued on December 29, 1992 to Marks et al., as well as Yoshida et
al., Bioessays 17, 423-430 (1995), Saito et al., PNAS USA 96, 4592-4597 (1999),
Furumai et al., PNAS USA 98 (1), 87-92 (2001), Komatsu et al., Cancer Res.
61(11), 4459-4466 (2001), Su et al., Cancer Res. 60, 3137-3142 (2000), Lee et al.,
Cancer Res. 61(3), 931-934 and Suzuki et al., J. Med. Chem. 42(15), 3001-3003
(1999), the entire contents of all of which are hereby incorporated by reference.
Examples of such histone deacetylase polypeptides include HDAC9, HDAC9a,
HDAC9(ΔNLS), HDAC9a(ΔNLS), HDRP(ΔNLS); a substantially pure polypeptide
comprising SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or SEQ
ID NO: 10; and a polypeptide having preferably at least 60%, more preferably, 70%,
75%, 80%, 85%, or 90%, and most preferably, 95% sequence identity to any one of
SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or SEQ ID NO: 10,
as determined using the BLAST program and parameters described herein.

In one embodiment, the histone deacetylase polypeptide has histone
deacetylase activity, transcription repression activity, the ability to deacetylate
substrates, or is inhibited by trichostatin A or a hybrid polar compound such as
SAHA. In another embodiment, the HDAC9(ΔNLS) polypeptide has any two of the
above biological activities. In still another embodiment, the HDAC9(ΔNLS)
polypeptide has any three of the above biological activities. In yet another
embodiment, the HDAC9(ΔNLS) polypeptide has all of the above biological
activities.

An HDAC9 polypeptide is a histone deacetylase polypeptide as described
above. An HDAC9 polypeptide preferably has at least 60%, more preferably, 70%,
75%, 80%, 85%, or 90%, and most preferably, 95% sequence identity to SEQ ID
NO: 2, as determined using the BLAST program and parameters described herein.
An HDAC9 polypeptide is also a polypeptide that comprises the amino acids
encoded by exons 23, 24, 25 and/or 26, and that does not comprise the amino acids
encoded by exon 13 of the HDAC9 nucleic acid sequence, as shown in FIGS. 1A-1C,
FIG. 4, and FIGS. 5A-5D. Preferably, an HDAC9 polypeptide comprises the
sequence of SEQ ID NO: 2. More preferably, an HDAC9 polypeptide consists of
the sequence of SEQ ID NO: 2. An HDAC9 polypeptide is also a polypeptide
comprising the amino acid sequence of the polypeptide encoded by the nucleic acid
sequence of SEQ ID NO: 1.

An HDAC9a polypeptide is a histone deacetylase polypeptide as described
above. An HDAC9a polypeptide preferably has at least 60%, more preferably, 70%,
75%, 80%, 85%, or 90%, and most preferably, 95% sequence identity to SEQ ID
NO: 4, as determined using the BLAST program and parameters described herein.
An HDAC9a polypeptide is also a polypeptide that comprises the amino acids
encoded by exon 22, and that does not comprise the amino acids encoded by exons
13, 23, 24, 25, or 26 of the HDAC9 nucleic acid sequence, as shown in FIGS. 1D-1G,
FIG. 4, and FIGS. 5A-5D. Preferably, an HDAC9a polypeptide comprises the
sequence of SEQ ID NO: 4. More preferably, an HDAC9a polypeptide consists of
the sequence of SEQ ID NO: 4. An HDAC9a polypeptide is also a polypeptide
comprising the amino acid sequence of the polypeptide encoded by the nucleic acid
sequence of SEQ ID NO: 3.

An HDAC9(ΔNLS) polypeptide is a histone deacetylase polypeptide as described
above. An HDAC9(ΔNLS) polypeptide does not comprise a nuclear localization signal
(NLS). An HDAC9(ΔNLS) polypeptide preferably has at least 60%, more
preferably, 70%, 75%, 80%, 85%, or 90%, and most preferably, 95% sequence
identity to SEQ ID NO: 6, as determined using the BLAST program and parameters
described herein. An HDAC9(ΔNLS) polypeptide is also a polypeptide that
comprises the amino acids encoded by exons 23, 24, 25, and/or 26, and that does not
comprise the amino acids encoded by exons 7 or 13 of the HDAC9 nucleic acid
sequence, as shown in FIGS. 1J-1L, and FIGS. 5A-5D. Preferably, an
HDAC9(ΔNLS) polypeptide comprises the sequence of SEQ ID NO: 6. More
preferably, an HDAC9(ΔNLS) polypeptide consists of the sequence of SEQ ID NO:
6. An HDAC9(ΔNLS) polypeptide is also a polypeptide comprising the amino acid
sequence of the polypeptide encoded by the nucleic acid sequence of SEQ ID NO: 5.

An HDAC9a(ΔNLS) polypeptide is a histone deacetylase polypeptide as
described above. An HDAC9a(ΔNLS) polypeptide does not comprise a nuclear
localization signal (NLS). An HDAC9a(ΔNLS) polypeptide preferably has at least 60%, more
preferably, 70%, 75%, 80%, 85%, or 90%, and most preferably, 95% sequence
identity to SEQ ID NO: 8, as determined using the BLAST program and parameters
described herein. An HDAC9a(ΔNLS) polypeptide is also a polypeptide that
comprises the amino acids encoded by exon 22, and that does not comprise the
amino acids encoded by exons 7, 13, 23, 24, 25, or 26 of the HDAC9 nucleic acid
sequence, as shown in FIGS. 1M-1O, and FIGS. 5A-5D. Preferably, an
HDAC9a(ΔNLS) polypeptide comprises the sequence of SEQ ID NO: 8. More
preferably, an HDAC9a(ΔNLS) polypeptide consists of the sequence of SEQ ID NO:
8. An HDAC9a(ΔNLS) polypeptide is also a polypeptide comprising the amino acid
sequence of the polypeptide encoded by the nucleic acid sequence of SEQ ID NO: 7.

An HDRP(ΔNLS) polypeptide is a histone deacetylase polypeptide as
described above. An HDRP(ΔNLS) polypeptide does not comprise a nuclear
localization signal (NLS). An HDRP(ΔNLS) polypeptide preferably has at least 60%,
more preferably, 70%, 75%, 80%, 85%, or 90%, and most preferably, 95% sequence
identity to SEQ ID NO: 10, as determined using the BLAST program and parameters
described herein. An HDRP(ΔNLS) polypeptide is also a polypeptide that does not comprise
the amino acids encoded by exons 7 or 13-26 of the HDAC9 nucleic acid sequence,
as shown in FIGS. 1H-1I and FIGS. 5A-5D. Preferably, an HDRP(ΔNLS)
polypeptide comprises the sequence of SEQ ID NO: 10. More preferably, an
HDRP(ΔNLS) polypeptide consists of the sequence of SEQ ID NO: 10. An
HDRP(ΔNLS) polypeptide is also a polypeptide comprising the amino acid sequence
of the polypeptide encoded by the nucleic acid sequence of SEQ ID NO: 9.
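The exon-based definitions above can be summarized as a small lookup, sketched here for illustration. The function name and the decision order are assumptions of this sketch, as is the use of exon 7 as a stand-in for the presence of the NLS; the sketch encodes only the inclusion and exclusion rules stated in the text and is not a substitute for sequence comparison.

```python
# Exons whose encoded amino acids distinguish the isoforms, per the text above.
C_TERMINAL_EXONS = {23, 24, 25, 26}      # present (any of them) in HDAC9 forms
EXONS_13_TO_26 = set(range(13, 27))      # all absent from HDRP forms


def classify_isoform(exons):
    """Return an isoform name for a collection of exon numbers, or None.

    Assumption of this sketch: absence of exon 7 marks the NLS-lacking form.
    """
    exons = set(exons)
    if 13 in exons:
        return None  # every isoform defined above lacks exon 13
    if exons & C_TERMINAL_EXONS:
        base = "HDAC9"
    elif 22 in exons:
        base = "HDAC9a"
    elif not exons & EXONS_13_TO_26:
        base = "HDRP"
    else:
        return None
    return base if 7 in exons else base + "(ΔNLS)"


# Hypothetical exon lists for illustration (exons 1-12 non-catalytic, 14-21 catalytic).
hdac9_exons = list(range(1, 13)) + list(range(14, 22)) + [23, 24, 25, 26]
hdac9a_exons = list(range(1, 13)) + list(range(14, 23))
```

For example, removing exon 7 from `hdac9_exons` changes the result from the full-length name to the corresponding NLS-lacking name.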

The polypeptides of the invention can be purified to homogeneity. It is
understood, however, that preparations in which the polypeptide is not purified to
homogeneity are useful. The critical feature is that the preparation allows for the
desired function of the polypeptide, even in the presence of considerable amounts of
other components. Thus, the invention encompasses various degrees of purity. In
one embodiment, the language "substantially free of cellular material" includes
preparations of the polypeptide having less than about 30% (by dry weight) other
proteins (i.e., contaminating protein), less than about 20% other proteins, less than
about 10% other proteins, or less than about 5% other proteins.

When a polypeptide is recombinantly produced, it can also be substantially
free of culture medium, i.e., culture medium represents less than about 20%, less
than about 10%, or less than about 5% of the volume of the polypeptide preparation.
The language "substantially free of chemical precursors or other chemicals" includes
preparations of the polypeptide in which it is separated from chemical precursors or
other chemicals that are involved in its synthesis. In one embodiment, the language
"substantially free of chemical precursors or other chemicals" includes preparations
of the polypeptide having less than about 30% (by dry weight) chemical precursors
or other chemicals, less than about 20% chemical precursors or other chemicals, less
than about 10% chemical precursors or other chemicals, or less than about 5%
chemical precursors or other chemicals.

In one embodiment, a polypeptide of the invention comprises an amino acid
sequence encoded by a nucleic acid molecule comprising a nucleotide sequence
selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5,
SEQ ID NO: 7, SEQ ID NO: 9, and complements and portions thereof (e.g., a
complement of any one of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID
NO: 7, SEQ ID NO: 9 or a portion of any one of SEQ ID NO: 1, SEQ ID NO: 3,
SEQ ID NO: 5, SEQ ID NO: 7, or SEQ ID NO: 9).

The polypeptides of the invention also encompass fragments and sequence
variants. Variants include a substantially homologous polypeptide encoded by the
same genetic locus in an organism, i.e., an allelic variant, as well as other variants.
Variants also encompass polypeptides derived from other genetic loci in an
organism, but having substantial homology to a polypeptide encoded by a nucleic
acid molecule comprising a nucleotide sequence selected from the group consisting
of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9,
and complements and portions thereof, or having substantial homology to a
polypeptide encoded by a nucleic acid molecule comprising a nucleotide sequence
selected from the group consisting of nucleotide sequences encoding any one of SEQ
ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or SEQ ID NO: 10.
Variants also include polypeptides substantially homologous or identical to these
polypeptides but derived from another organism, i.e., an ortholog. Variants also
include polypeptides that are substantially homologous or identical to these
polypeptides that are produced by chemical synthesis. Variants also include
polypeptides that are substantially homologous or identical to these polypeptides that
are produced by recombinant methods.

As used herein, two polypeptides (or a region of the polypeptides) are
substantially homologous or identical when the amino acid sequences are at least
about 60-65%, typically at least about 70-75%, more typically at least about 80-85%,
and most typically greater than about 90-95% or more homologous or identical. A
substantially identical or homologous amino acid sequence, according to the present
invention, will be encoded by a nucleic acid molecule hybridizing to SEQ ID NO: 1,
SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, or a portion thereof,
under stringent conditions as more particularly described herein, or will be encoded
by a nucleic acid molecule hybridizing to a nucleic acid sequence encoding SEQ ID
NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, or portion
thereof, under stringent conditions as more particularly described herein.

The percent identity of two nucleotide or amino acid sequences can be
determined by aligning the sequences for optimal comparison purposes (e.g., gaps
can be introduced in the sequence of a first sequence). The nucleotides or amino
acids at corresponding positions are then compared, and the percent identity between
the two sequences is a function of the number of identical positions shared by the
sequences (i.e., % identity = # of identical positions/total # of positions x 100). In






certain embodiments, the length of the HDAC9, HDAC9a, HDAC9(ΔNLS),
HDAC9a(ΔNLS), and HDRP(ΔNLS) amino acid or nucleotide sequence aligned for
comparison purposes is at least 30%, preferably, at least 40%, more preferably, at
least 60%, and even more preferably, at least 70%, 80%, 90%, or 100% of the length
of the reference sequence, for example, those sequences provided in FIGS. 1A-1O
and 2A-2E. The actual comparison of the two sequences can be accomplished by
well-known methods, for example, using a mathematical algorithm. A preferred,
non-limiting example of such a mathematical algorithm is described in Karlin et al.,
Proc. Natl. Acad. Sci. USA, 90:5873-5877 (1993). Such an algorithm is
incorporated into the BLASTN and BLASTX programs (version 2.2) as described in
Schaffer et al., Nucleic Acids Res., 29:2994-3005 (2001). When utilizing BLAST
and Gapped BLAST programs, the default parameters of the respective programs
(e.g., BLASTN) can be used. See http://www.ncbi.nlm.nih.gov, as available on
August 10, 2001. In one embodiment, the database searched is a non-redundant
(NR) database, and parameters for sequence comparison can be set at: no filters;
Expect value of 10; Word Size of 3; the Matrix is BLOSUM62; and Gap Costs have
an Existence of 11 and an Extension of 1.
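The percent identity formula stated above (identical positions divided by total positions, times 100) can be illustrated with a minimal sketch operating on two already-aligned sequences. The function names and the gap character are assumptions of this sketch, as is the choice to count every alignment column, including gapped ones, toward the total; alignment programs such as BLAST report identity over their own aligned region and may count positions differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """% identity = identical positions / total positions x 100.

    Both inputs must come from the same pairwise alignment and therefore
    have equal length; a gap never counts as an identical position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


def coverage_percent(aligned_reference, reference_length, gap="-"):
    """Share of the full reference sequence that takes part in the alignment."""
    aligned_residues = sum(1 for c in aligned_reference if c != gap)
    return 100.0 * aligned_residues / reference_length
```

The second helper corresponds to the requirement that the aligned length be at least a stated percentage of the reference sequence length.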

Another preferred, non-limiting example of a mathematical algorithm
utilized for the comparison of sequences is the algorithm of Myers and Miller,
CABIOS (1989). Such an algorithm is incorporated into the ALIGN program
(version 2.0), which is part of the GCG (Accelrys) sequence alignment software
package. When utilizing the ALIGN program for comparing amino acid sequences,
a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4
can be used. Additional algorithms for sequence analysis are known in the art and
include ADVANCE and ADAM as described in Torelli and Robotti, Comput.
Appl. Biosci., 10: 3-5 (1994); and FASTA described in Pearson and Lipman, Proc.
Natl. Acad. Sci. USA, 85: 2444-8 (1988).

In another embodiment, the percent identity between two amino acid
sequences can be determined using the GAP program in the GCG software
package (available at http://www.accelrys.com, as available on August 31, 2001)
using either a Blossom 63 matrix or a PAM250 matrix, and a gap weight of 12, 10,
8, 6, or 4 and a length weight of 2, 3, or 4. In yet another embodiment, the percent
identity between two nucleic acid sequences can be determined using the GAP
program in the GCG software package (available at http://www.cgc.com), using a
gap weight of 50 and a length weight of 3.

The invention also encompasses HDAC9, HDAC9a, HDAC9(ΔNLS),
HDAC9a(ΔNLS), and HDRP(ΔNLS) polypeptides having a lower degree of identity
but having sufficient similarity so as to perform one or more of the same functions
performed by an HDAC9, HDAC9a, HDAC9(ΔNLS), HDAC9a(ΔNLS), or
HDRP(ΔNLS) polypeptide encoded by a nucleic acid molecule of the invention.
Similarity is determined by conserved amino acid substitution. Such substitutions
are those that substitute a given amino acid in a polypeptide by another amino acid
of like characteristics. Conservative substitutions are likely to be phenotypically
silent. Typically seen as conservative substitutions are the replacements, one for
another, among the aliphatic amino acids Ala, Val, Leu, and Ile; interchange of the
hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu;
substitution between the amide residues Asn and Gln; exchange of the basic residues
Lys and Arg; and replacements among the aromatic residues Phe and Tyr. Guidance
concerning which amino acid changes are likely to be phenotypically silent is found
in Bowie et al., Science, 247: 1306-1310 (1990).
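The conservative substitution classes listed above lend themselves to a simple lookup. The following is a minimal sketch, assuming three-letter residue codes and only the six groups named in the text; the names `CONSERVATIVE_GROUPS` and `conservative_group` are hypothetical.

```python
from typing import Optional

# Residue classes exactly as enumerated in the paragraph above.
CONSERVATIVE_GROUPS = {
    "aliphatic": {"Ala", "Val", "Leu", "Ile"},
    "hydroxyl": {"Ser", "Thr"},
    "acidic": {"Asp", "Glu"},
    "amide": {"Asn", "Gln"},
    "basic": {"Lys", "Arg"},
    "aromatic": {"Phe", "Tyr"},
}


def conservative_group(original: str, replacement: str) -> Optional[str]:
    """Return the shared class name if the substitution is conservative, else None."""
    if original == replacement:
        return None  # not a substitution at all
    for name, members in CONSERVATIVE_GROUPS.items():
        if original in members and replacement in members:
            return name
    return None
```

Under this sketch, a Leu to Ile change is reported as "aliphatic", whereas an Asp to Lys change returns `None` and would be treated as non-conservative.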

A variant polypeptide can differ in amino acid sequence by one or more
substitutions, deletions, insertions, inversions, fusions, and truncations or a
combination of any of these. Further, variant polypeptides can be fully functional or
can lack function in one or more activities, for example, in histone deacetylase
activity or transcription repression activity. Fully functional variants typically
contain only conservative variation or variation in non-critical residues or in
non-critical regions. Functional variants can also contain substitution of similar
amino acids that result in no change or an insignificant change in function.
Alternatively, such substitutions may positively or negatively affect function to some
degree. Non-functional variants typically contain one or more non-conservative
amino acid substitutions, deletions, insertions, inversions, or truncations or a
substitution, insertion, inversion, or deletion in a critical residue or critical region.
Such critical regions include the HDAC domains, which provide the polypeptide
with deacetylase activity, as shown in the nucleic acid sequences of FIGS. 1A-1G, as
well as in the schematic of FIG. 4.

Amino acids that are essential for function can be identified by methods
known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis
(Cunningham et al., Science, 244: 1081-1085 (1989)). The latter procedure
introduces a single alanine mutation at each of the residues in the molecule (one
mutation per molecule). The resulting mutant molecules are then tested for
biological activity in vitro. Sites that are critical for polypeptide activity can also be
determined by structural analysis, such as crystallization, nuclear magnetic
resonance, or photoaffinity labeling (See Smith et al., J. Mol. Biol., 224: 899-904
(1992); and de Vos et al., Science, 255: 306-312 (1992)).
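The alanine-scanning procedure described above (one alanine mutation per molecule) can be sketched as follows. This is an illustration only, assuming a one-letter amino acid string; the function name is hypothetical, and skipping positions that are already alanine is an assumption of this sketch rather than something stated in the text.

```python
from typing import List, Tuple


def alanine_scan(sequence: str) -> List[Tuple[str, str]]:
    """Return (label, mutant sequence) pairs, one single-alanine mutant per residue."""
    mutants = []
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue  # assumption: a residue that is already alanine is not mutated
        label = f"{residue}{index + 1}A"  # e.g. "K3A" = Lys at position 3 to Ala
        mutant = sequence[:index] + "A" + sequence[index + 1:]
        mutants.append((label, mutant))
    return mutants
```

Each mutant in the returned list would then be expressed and assayed separately, for example for histone deacetylase activity.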

The invention also includes HDAC9, HDAC9a, HDAC9(ΔNLS),
HDAC9a(ΔNLS), and HDRP(ΔNLS) polypeptide fragments of the polypeptides of
the invention. Fragments can be derived from a polypeptide comprising SEQ ID
NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or SEQ ID NO: 10, or from
a polypeptide encoded by a nucleic acid molecule comprising SEQ ID NO: 1, SEQ
ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, or SEQ ID NO: 9 or a portion thereof and
the complements thereof or other variants. The present invention also encompasses
fragments of the variants of the polypeptides described herein. Useful fragments
include those that retain one or more of the biological activities of the polypeptide as
well as fragments that can be used as an immunogen to generate polypeptide-specific
antibodies.

Biologically active fragments (peptides that are, for example, 6, 9, 12, 15, 16,
20, 30, 35, 36, 37, 38, 39, 40, 50, 100, or more amino acids in length) can comprise
a domain, segment, or motif, for example, an HDAC domain, that has been
identified by analysis of the polypeptide sequence using well-known methods, e.g.,
signal peptides, extracellular domains, one or more transmembrane segments or
loops, ligand binding regions, zinc finger domains, DNA binding domains, acylation
sites, glycosylation sites, or phosphorylation sites.

Fragments can be discrete (not fused to other amino acids or polypeptides) or
can be within a larger polypeptide. Further, several fragments can be comprised
within a single larger polypeptide. In one embodiment, a fragment designed for
expression in a host can have heterologous pre- and pro-polypeptide regions fused to
the amino terminus of the polypeptide fragment and an additional region fused to the
carboxyl terminus of the fragment.

The invention thus provides chimeric or fusion polypeptides. These
comprise an HDAC9, HDAC9a, HDAC9(ΔNLS), HDAC9a(ΔNLS), or HDRP(ΔNLS)
polypeptide of the invention operatively linked to a heterologous protein or
polypeptide having an amino acid sequence not substantially homologous to the
polypeptide. "Operatively linked" indicates that the polypeptide and the
heterologous protein are fused in-frame. The heterologous protein can be fused to
the N-terminus or C-terminus of the polypeptide. In one embodiment, the fusion
polypeptide does not affect the function of the polypeptide per se. For example, the
fusion polypeptide can be a GST-fusion polypeptide in which the polypeptide
sequences are fused to the C-terminus of the GST sequences. Other types of fusion
polypeptides include, but are not limited to, enzymatic fusion polypeptides, for
example, β-galactosidase fusions, yeast two-hybrid GAL fusions, poly-His fusions,
and Ig fusions. Such fusion polypeptides, particularly poly-His fusions, can
facilitate the purification of recombinant polypeptide. In certain host cells (e.g.,
[...] cells), expression and/or secretion of a polypeptide can be
increased by using a heterologous signal sequence. Therefore, in another
embodiment, the fusion polypeptide contains a heterologous signal sequence at its
N-terminus.

EP-A 0464 533 discloses fusion proteins comprising various portions of
immunoglobulin constant regions. The Fc is useful in therapy and diagnosis and
thus results, for example, in improved pharmacokinetic properties (EP-A 0232 262).
In drug discovery, for example, human proteins have been fused with Fc portions for
the purpose of high-throughput screening assays to identify antagonists. (See
Bennett et al., Journal of Molecular Recognition, 8: 52-58 (1995) and Johanson et
al., The Journal of Biological Chemistry, 270, 16: 9459-9471 (1995)). Thus, this
invention also encompasses soluble fusion polypeptides containing a polypeptide of
the invention and various portions of the constant regions of heavy or light chains of
immunoglobulins of various subclass (IgG, IgM, IgA, IgE).

A chimeric or fusion polypeptide can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide
sequences are ligated together in-frame in accordance with conventional techniques.
In another embodiment, the fusion gene can be synthesized by conventional
techniques including automated DNA synthesizers. Alternatively, PCR
amplification of nucleic acid fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive nucleic acid
fragments that can subsequently be annealed and re-amplified to generate a chimeric
nucleic acid sequence (see Ausubel et al., "Current Protocols in Molecular Biology,"
John Wiley & Sons, (1998), the entire teachings of which are incorporated by
reference herein). Moreover, many expression vectors are commercially available
that already encode a fusion moiety (e.g., a GST protein). A nucleic acid molecule
encoding a polypeptide of the invention can be cloned into such an expression vector
such that the fusion moiety is linked in-frame to the polypeptide.
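The requirement that the fusion moiety be linked in-frame can be illustrated computationally. The sketch below is a simplified model, not a cloning protocol: it assumes both inputs are complete coding sequences written as DNA strings, and the function name and the short example sequences used with it are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(n_terminal_cds: str, c_terminal_cds: str) -> str:
    """Join two coding sequences so the second is read in the frame of the first."""
    first = n_terminal_cds.upper()
    second = c_terminal_cds.upper()
    if len(first) % 3 or len(second) % 3:
        raise ValueError("each coding sequence must be a whole number of codons")
    codons = [first[i:i + 3] for i in range(0, len(first), 3)]
    # Drop the terminal stop codon of the N-terminal partner (e.g. the fusion
    # moiety) so that translation continues into the C-terminal partner.
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("internal stop codon in the N-terminal partner")
    return "".join(codons) + second
```

With the fusion moiety supplied as the first argument, the polypeptide of interest ends up fused to the C-terminus of the fusion moiety, which matches the GST arrangement described in the text.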

The substantially pure, isolated, or substantially pure and isolated HDAC9,
HDAC9a, HDAC9(ΔNLS), HDAC9a(ΔNLS), or HDRP(ΔNLS) polypeptide can be
purified from cells that naturally express it, purified from cells that have been altered
to express it (recombinant), or synthesized using known protein synthesis methods.
In one embodiment, the polypeptide is produced by recombinant DNA techniques.
For example, a nucleic acid molecule encoding the polypeptide is cloned into an
expression vector, the expression vector introduced into a host cell, and the
polypeptide expressed in the host cell. The polypeptide can then be isolated from
the cells by an appropriate purification scheme using standard protein purification
techniques.

In general, HDAC9, HDAC9a, HDAC9(ΔNLS), HDAC9a(ΔNLS), and
HDRP(ΔNLS) polypeptides of the present invention can be used as a molecular
weight marker on SDS-PAGE gels or on molecular sieve gel filtration columns
using art-recognized methods. The polypeptides of the present invention can be
used to raise antibodies or to elicit an immune response. The polypeptides can also
be used as a reagent, e.g., a labeled reagent, in assays to quantitatively determine
levels of the polypeptide or a molecule to which it binds (e.g., a receptor or a ligand)
in biological fluids. The polypeptides can also be used as markers for cells or tissues
in which the corresponding polypeptide is preferentially expressed, either
constitutively, during tissue differentiation, or in a diseased state. The polypeptides
can be used to isolate a corresponding binding agent, and to screen for peptide or
small molecule antagonists or agonists of the binding interaction. The polypeptides
of the present invention can also be used as therapeutic agents.

NUCLEIC ACID MOLECULES OF THE INVENTION 

The present invention also features isolated HDAC9, HDAC9a,
HDAC9(ΔNLS), HDAC9a(ΔNLS), and HDRP(ΔNLS) nucleic acid molecules.

By a "histone deacetylase nucleic acid molecule" is meant a nucleic acid
molecule that encodes a histone deacetylase polypeptide. Such histone deacetylase
nucleic acids include, for example, the HDAC9, HDAC9a, HDAC9(ΔNLS),
HDAC9a(ΔNLS), or HDRP(ΔNLS) nucleic acid molecule described in detail herein;
an isolated nucleic acid comprising SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5,
SEQ ID NO: 7, or SEQ ID NO: 9; a complement of an isolated nucleic acid
comprising SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, or SEQ
ID NO: 9; an isolated nucleic acid encoding a histone deacetylase polypeptide of
SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or SEQ ID NO: 10;
a complement of an isolated nucleic acid encoding a histone deacetylase polypeptide
of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or SEQ ID
NO: 10; a nucleic acid that is hybridizable under high stringency conditions to a
nucleic acid molecule that encodes any of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID
NO: 6, or SEQ ID NO: 8, or a complement thereof; a nucleic acid molecule that is
hybridizable under high stringency conditions to a nucleic acid comprising SEQ ID
NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, or SEQ ID NO: 7; and an isolated nucleic
acid molecule that has at least 55%, more preferably, 60%, 65%, 70%, 75%, 80%,
85%, or 90%, and most preferably, 95% or 99% sequence identity with any one of
SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, or a
complement thereof.

An HDAC9 nucleic acid molecule is a nucleic acid molecule that encodes an
HDAC9 polypeptide. In one embodiment, the HDAC9 nucleic acid molecule is
selected from: a nucleic acid molecule that comprises the nucleic acid sequence of
SEQ ID NO: 1; a complement of an isolated nucleic acid comprising SEQ ID NO: 1;
an isolated nucleic acid encoding a histone deacetylase polypeptide of SEQ ID NO:
2; a complement of an isolated nucleic acid encoding a histone deacetylase
polypeptide of SEQ ID NO: 2; a nucleic acid that is hybridizable under high
stringency conditions to a nucleic acid molecule that encodes SEQ ID NO: 2; a
nucleic acid molecule that is hybridizable under high stringency conditions to a
nucleic acid comprising SEQ ID NO: 1; and an isolated nucleic acid molecule that
has preferably, at least 55%, more preferably, 60%, 65%, 70%, 75%, 80%, 85%, or
90%, and most preferably, 95% or 99% sequence identity with SEQ ID NO: 1, as
determined using the BLAST program and parameters described herein. In another
embodiment, the HDAC9 nucleic acid molecule consists of the nucleic acid
sequence of SEQ ID NO: 1.

An HDAC9a nucleic acid molecule is a nucleic acid molecule that encodes
an HDAC9a polypeptide. An HDAC9a nucleic acid molecule preferably has at least
55% sequence identity to SEQ ID NO: 3. In one embodiment, the HDAC9a nucleic
acid molecule is selected from: a nucleic acid molecule that comprises the nucleic
acid sequence of SEQ ID NO: 3; a complement of an isolated nucleic acid
comprising SEQ ID NO: 3; an isolated nucleic acid encoding a histone deacetylase
polypeptide of SEQ ID NO: 4; a complement of an isolated nucleic acid encoding a
histone deacetylase polypeptide of SEQ ID NO: 4; a nucleic acid that is
hybridizable under high stringency conditions to a nucleic acid molecule that
encodes SEQ ID NO: 4; a nucleic acid molecule that is hybridizable under high
stringency conditions to a nucleic acid comprising SEQ ID NO: 3; and an isolated
nucleic acid molecule that has preferably, at least 55%, more preferably, 60%, 65%,
70%, 75%, 80%, 85%, or 90%, and most preferably, 95% or 99% sequence identity
with SEQ ID NO: 3 or a complement thereof, as determined using the BLAST
program and parameters described herein. In another embodiment, the HDAC9a
nucleic acid molecule consists of the nucleic acid sequence of SEQ ID NO: 3.

An HDAC9(ΔNLS) nucleic acid molecule is a nucleic acid molecule that
encodes an HDAC9(ΔNLS) polypeptide. In one embodiment, the HDAC9(ΔNLS)
nucleic acid molecule is selected from: a nucleic acid molecule that comprises the
nucleic acid sequence of SEQ ID NO: 5; a complement of an isolated nucleic acid
comprising SEQ ID NO: 5; an isolated nucleic acid encoding a histone deacetylase
polypeptide of SEQ ID NO: 6; a complement of an isolated nucleic acid encoding a
histone deacetylase polypeptide of SEQ ID NO: 6; a nucleic acid that is
hybridizable under high stringency conditions to a nucleic acid molecule that
encodes SEQ ID NO: 6; a nucleic acid molecule that is hybridizable under high
stringency conditions to a nucleic acid comprising SEQ ID NO: 5; and an isolated
nucleic acid molecule that has preferably, at least 55%, more preferably, 60%, 65%,
70%, 75%, 80%, 85%, or 90%, and most preferably, 95% or 99% sequence identity
with SEQ ID NO: 5 or a complement thereof, as determined using the BLAST
program and parameters described herein. In another embodiment, the
HDAC9(ΔNLS) nucleic acid molecule consists of the nucleic acid sequence of SEQ
ID NO: 5.

An HDAC9a(ΔNLS) nucleic acid molecule is a nucleic acid molecule that
encodes an HDAC9a(ΔNLS) polypeptide. In one embodiment, the HDAC9a(ΔNLS)
nucleic acid molecule is selected from: a nucleic acid molecule that comprises the
nucleic acid sequence of SEQ ID NO: 7; a complement of an isolated nucleic acid
comprising SEQ ID NO: 7; an isolated nucleic acid encoding a histone deacetylase
polypeptide of SEQ ID NO: 8; a complement of an isolated nucleic acid encoding a
histone deacetylase polypeptide of SEQ ID NO: 8; a nucleic acid that is
hybridizable under high stringency conditions to a nucleic acid molecule that
encodes SEQ ID NO: 8; a nucleic acid molecule that is hybridizable under high
stringency conditions to a nucleic acid comprising SEQ ID NO: 7; and an isolated
nucleic acid molecule that has preferably, at least 55%, more preferably, 60%, 65%,
70%, 75%, 80%, 85%, or 90%, and most preferably, 95% or 99% sequence identity
with SEQ ID NO: 7 or a complement thereof, as determined using the BLAST
program and parameters described herein. In another embodiment, the
HDAC9a(ΔNLS) nucleic acid molecule consists of the nucleic acid sequence of SEQ
ID NO: 7.

An "HDRP(ΔNLS) nucleic acid molecule" is a nucleic acid molecule that
encodes an HDRP(ΔNLS) polypeptide. In one embodiment, the HDRP(ΔNLS)
nucleic acid molecule is selected from: a nucleic acid molecule that comprises the
nucleic acid sequence of SEQ ID NO: 9; a complement of an isolated nucleic acid
comprising SEQ ID NO: 9; an isolated nucleic acid encoding a histone deacetylase
polypeptide of SEQ ID NO: 10; a complement of an isolated nucleic acid encoding a
histone deacetylase polypeptide of SEQ ID NO: 10; and an isolated nucleic acid
molecule that has preferably, at least 55%, more preferably, 60%, 65%, 70%, 75%,
80%, 85%, or 90%, and most preferably, 95% or 99% sequence identity with SEQ
ID NO: 9 or a complement thereof, as determined using the BLAST program and
parameters described herein. In another embodiment, the HDRP(ΔNLS) nucleic
acid molecule consists of the nucleic acid sequence of SEQ ID NO: 9.

The isolated nucleic acid molecules of the present invention can be RNA, for
example, mRNA, or DNA, such as cDNA and genomic DNA. DNA molecules can
be double-stranded or single-stranded; single-stranded RNA or DNA can be either
the coding, or sense, strand or the non-coding, or antisense, strand. The nucleic acid
molecule can include all or a portion of the coding sequence of the gene and can
further comprise additional non-coding sequences such as introns and non-coding 3'
and 5' sequences (including regulatory sequences, for example). Additionally, the
nucleic acid molecule can be fused to a marker sequence, for example, a sequence
that encodes a polypeptide to assist in isolation or purification of the polypeptide.
Such sequences include, but are not limited to, those that encode a
glutathione-S-transferase (GST) fusion protein and those that encode a
hemagglutinin A (HA) polypeptide marker from influenza.

An "isolated," "substantially pure," or "substantially pure and isolated"
nucleic acid molecule, as used herein, is one that is separated from nucleic acids that
normally flank the gene or nucleotide sequence (as in genomic sequences) and/or has
been completely or partially purified from other transcribed sequences (e.g., as in an
RNA or cDNA library). For example, an isolated nucleic acid of the invention may
be substantially isolated with respect to the complex cellular milieu in which it
naturally occurs, or culture medium when produced by recombinant techniques, or
chemical precursors or other chemicals when chemically synthesized. In some
instances, the isolated material will form part of a composition (for example, a crude
extract containing other substances), buffer system, or reagent mix. In other
circumstances, the material may be purified to essential homogeneity, for example,
as determined by agarose gel electrophoresis or column chromatography such as
HPLC. Preferably, an isolated nucleic acid molecule comprises at least about 50, 80,
or 90% (on a molar basis) of all macromolecular species present.

With regard to genomic DNA, the term "isolated" also can refer to nucleic
acid molecules that are separated from the chromosome with which the genomic
DNA is naturally associated. For example, the isolated nucleic acid molecule can
contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of nucleotides
that flank the nucleic acid molecule in the genomic DNA of the cell from which the
nucleic acid molecule is derived.

The HDAC9, HDAC9a, HDAC9(ΔNLS), HDAC9a(ΔNLS), or HDRP(ΔNLS)
nucleic acid molecule can be fused to other coding or regulatory sequences and still
be considered isolated. Thus, recombinant DNA contained in a vector is included in
the definition of "isolated" as used herein. Also, isolated nucleic acid molecules
include recombinant DNA molecules in heterologous host cells, as well as partially
or substantially purified DNA molecules in solution. "Isolated" nucleic acid
molecules also encompass in vivo and in vitro RNA transcripts of the DNA
molecules of the present invention. An isolated nucleic acid molecule or nucleotide
sequence can include a nucleic acid molecule or nucleotide sequence that is
synthesized chemically or by recombinant means. Therefore, recombinant DNA
contained in a vector is included in the definition of "isolated" as used herein.
Isolated nucleotide molecules also include recombinant DNA molecules in
heterologous organisms, as well as partially or substantially purified DNA molecules
in solution. In vivo and in vitro RNA transcripts of the DNA molecules of the
present invention are also encompassed by "isolated" nucleotide sequences. Such
isolated nucleotide sequences are useful in the manufacture of the encoded
polypeptide, as probes for isolating homologous sequences (e.g., from other
mammalian species), for gene mapping (e.g., by in situ hybridization with
chromosomes), or for detecting expression of the gene in tissue (e.g., human tissue),
such as by Northern blot analysis.

The present invention also pertains to variant HDAC9, HDAC9a,
HDAC9(ΔNLS), HDAC9a(ΔNLS), and HDRP(ΔNLS) nucleic acid molecules that are
not necessarily found in nature but that encode an HDAC9, HDAC9a,
HDAC9(ΔNLS), HDAC9a(ΔNLS), or HDRP(ΔNLS) polypeptide. Thus, for
example, DNA molecules that comprise a sequence that is different from the
naturally-occurring HDAC9, HDAC9a, HDAC9(ΔNLS), HDAC9a(ΔNLS), or
HDRP(ΔNLS) nucleotide sequence but which, due to the degeneracy of the genetic
code, encode an HDAC9, HDAC9a, HDAC9(ΔNLS), HDAC9a(ΔNLS), or
HDRP(ΔNLS) polypeptide of the present invention are also the subject of this
invention.
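The degeneracy of the genetic code referred to above can be shown with a small translation sketch. It assumes the standard genetic code and DNA written 5' to 3' in the coding frame; the helper names and the nine-nucleotide example sequences are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acids ('*' = stop)."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encodes_same_polypeptide(dna_a: str, dna_b: str) -> bool:
    """True if two different nucleotide sequences encode the same amino acids."""
    return translate(dna_a) == translate(dna_b)
```

For example, `GGAATTGCC` and `GGCATCGCT` differ at three nucleotide positions yet both translate to Gly-Ile-Ala, so `encodes_same_polypeptide` returns `True` for that pair.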

The invention also encompasses HDAC9, HDAC9a, HDAC9(ΔNLS),
HDAC9a(ΔNLS), and HDRP(ΔNLS) nucleotide sequences encoding portions
(fragments), or encoding variant polypeptides such as analogues or derivatives of an
HDAC9, HDAC9a, HDAC9(ΔNLS), HDAC9a(ΔNLS), or HDRP(ΔNLS)
polypeptide. Such variants can be naturally-occurring, such as in the case of allelic
variation or single nucleotide polymorphisms, or non-naturally-occurring, such as
those induced by various mutagens and mutagenic processes. Intended variations
include, but are not limited to, addition, deletion, and substitution of one or more
nucleotides that can result in conservative or non-conservative amino acid changes,
including additions and deletions. Preferably, the HDAC9, HDAC9a,
HDAC9(ΔNLS), HDAC9a(ΔNLS), or HDRP(ΔNLS) nucleotide (and/or resultant
amino acid) changes are silent or conserved; that is, they do not alter the
characteristics or activity of the HDAC9, HDAC9a, HDAC9(ΔNLS),
HDAC9a(ΔNLS), or HDRP(ΔNLS) polypeptide. In one preferred embodiment, the
nucleotide sequences are fragments that comprise one or more polymorphic
microsatellite markers.

Other alterations of the HDAC9, HDAC9a, HDAC9(ΔNLS),
HDAC9a(ΔNLS), or HDRP(ΔNLS) nucleic acid molecules of the invention can
include, for example, labeling, methylation, internucleotide modifications such as
uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoamidates,
and carbamates), charged linkages (e.g., phosphorothioates or phosphorodithioates),
pendent moieties (e.g., polypeptides), intercalators (e.g., acridine or psoralen),
chelators, alkylators, and modified linkages (e.g., alpha anomeric nucleic acids).
Also included are synthetic molecules that mimic nucleic acid molecules in the
ability to bind to a designated sequence via hydrogen bonding and other chemical
interactions. Such molecules include, for example, those in which peptide linkages
substitute for phosphate linkages in the backbone of the molecule.

The invention also pertains to HDAC9, HDAC9a, HDAC9(ΔNLS),
HDAC9a(ΔNLS), and HDRP(ΔNLS) nucleic acid molecules that hybridize under
high stringency hybridization conditions, such as for selective hybridization, to a
nucleotide sequence described herein (e.g., nucleic acid molecules that specifically
hybridize to a nucleotide sequence encoding polypeptides described herein, and,
optionally, have an activity of the polypeptide). In one embodiment, the invention
includes variants described herein that hybridize under high stringency hybridization
conditions (e.g., for selective hybridization) to a nucleotide sequence comprising a
nucleotide sequence selected from SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5,
SEQ ID NO: 7, SEQ ID NO: 9 and the complement of SEQ ID NO: 1, SEQ ID NO:
3, SEQ ID NO: 5, SEQ ID NO: 7, or SEQ ID NO: 9. In another embodiment, the
invention includes variants described herein that hybridize under high stringency
hybridization conditions (e.g., for selective hybridization) to a nucleotide sequence
encoding an amino acid sequence of SEQ ID NO: 2 (HDAC9), SEQ ID NO: 4
(HDAC9a), SEQ ID NO: 6 (HDAC9(ΔNLS)), SEQ ID NO: 8 (HDAC9a(ΔNLS)), or
SEQ ID NO: 10 (HDRP(ΔNLS)). In a preferred embodiment, the variant that
hybridizes under high stringency hybridization conditions encodes a polypeptide
that has a biological activity of an HDAC9, HDAC9a, HDAC9(ΔNLS),
HDAC9a(ΔNLS), or HDRP(ΔNLS) polypeptide (e.g., histone deacetylase activity or
transcription repression activity).

Such nucleic acid molecules can be detected and/or isolated by specific
hybridization (e.g., under high stringency conditions). "Specific hybridization," as
used herein, refers to the ability of a first nucleic acid to hybridize to a second
nucleic acid in a manner such that the first nucleic acid does not hybridize to any
nucleic acid other than to the second nucleic acid (e.g., when the first nucleic acid
has a higher similarity to the second nucleic acid than to any other nucleic acid in a
sample wherein the hybridization is to be performed). "Stringency conditions" for
hybridization is a term of art that refers to the incubation and wash conditions, e.g.,
conditions of temperature and buffer concentration, that permit hybridization of a
particular nucleic acid to a second nucleic acid; the first nucleic acid may be
perfectly (i.e., 100%) complementary to the second, or the first and second may
share some degree of complementarity that is less than perfect (e.g., 70%, 75%,
85%, 95%). For example, certain high stringency conditions can be used that
distinguish perfectly complementary nucleic acids from those of less
complementarity. "High stringency conditions," "moderate stringency conditions,"
and "low stringency conditions" for nucleic acid hybridizations are explained on
pages 2.10.1-2.10.16 and pages 6.3.1-6.3.6 in Current Protocols in Molecular
Biology (See Ausubel et al., supra, the entire teachings of which are incorporated by
reference herein). The exact conditions that determine the stringency of
hybridization depend not only on ionic strength (e.g., 0.2X SSC or 0.1X SSC),
temperature (e.g., room temperature, 42°C or 68°C), and the concentration of
destabilizing agents such as formamide or denaturing agents such as SDS, but also
on factors such as the length of the nucleic acid sequence, base composition, percent
mismatch between hybridizing sequences, and the frequency of occurrence of
subsets of that sequence within other non-identical sequences. Thus, equivalent
conditions can be determined by varying one or more of these parameters while
maintaining a similar degree of identity or similarity between the two nucleic acid
molecules. Typically, conditions are used such that sequences at least about 60%, at
least about 70%, at least about 80%, at least about 90% or at least about 95% or
more identical to each other remain hybridized to one another. By varying
hybridization conditions from a level of stringency at which no hybridization occurs
to a level at which hybridization is first observed, conditions that will allow a given
sequence to hybridize (e.g., selectively) with the most similar sequences in the
sample can be determined.

Exemplary conditions are described in Krause and Aaronson, Methods in
Enzymology, 200:546-556 (1991). See also Ausubel et al., supra, which describes
the determination of washing conditions for moderate or low stringency conditions.
Washing is the step in which conditions are usually set so as to determine a
minimum level of complementarity of the hybrids. Generally, starting from the
lowest temperature at which only homologous hybridization occurs, each °C by
which the final wash temperature is reduced (holding SSC concentration constant)
allows an increase by 1% in the maximum extent of mismatching among the
sequences that hybridize. Generally, doubling the concentration of SSC results in an 
increase in Tm of 17°C. Using these guidelines, the washing temperature can be 
determined empirically for high, moderate, or low stringency, depending on the level 
of mismatch sought. 
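The wash-temperature guideline stated above (roughly 1% additional mismatch tolerated per °C of reduction at constant SSC concentration) can be written as a simple calculation. This is a rule-of-thumb sketch only; the function names are hypothetical, and the starting temperature must be supplied by the user as the empirically determined lowest temperature at which only homologous hybridization occurs.

```python
def max_mismatch_percent(stringent_temp_c: float, wash_temp_c: float) -> float:
    """Estimated maximum mismatch (%) tolerated at the given final wash temperature."""
    reduction = stringent_temp_c - wash_temp_c
    # Each degree of reduction allows about 1% more mismatch; washing at or
    # above the stringent temperature allows none under this guideline.
    return max(0.0, reduction) * 1.0


def wash_temp_for_mismatch(stringent_temp_c: float, mismatch_percent: float) -> float:
    """Final wash temperature (°C) expected to tolerate the requested mismatch."""
    if mismatch_percent < 0:
        raise ValueError("mismatch_percent must be non-negative")
    return stringent_temp_c - mismatch_percent
```

As an illustration, if only homologous hybrids survive at 68°C, a wash at 58°C would be expected to tolerate up to about 10% mismatch; the result is a starting point for the empirical determination described in the text, not a substitute for it.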

For example, a low stringency wash can comprise washing in a solution
containing 0.2X SSC/0.1% SDS for 10 minutes at room temperature; a moderate
stringency wash can comprise washing in a prewarmed (42°C) solution
containing 0.2X SSC/0.1% SDS for 15 minutes at 42°C; and a high stringency wash
can comprise washing in a prewarmed (68°C) solution containing 0.1X SSC/0.1% SDS
for 15 minutes at 68°C. Furthermore, washes can be performed repeatedly or
sequentially to obtain a desired result as known in the art. Equivalent conditions can
be determined by varying one or more of the parameters given as an example, as
known in the art, while maintaining a similar degree of identity or similarity between
the target nucleic acid molecule and the primer or probe used.

To determine the percent homology or identity of two nucleic acid
sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps
can be introduced in the sequence of one polypeptide or nucleic acid molecule for
optimal alignment with the other polypeptide or nucleic acid molecule). The amino
acid residues or nucleotides at corresponding amino acid positions or nucleotide
positions are then compared, as described above.
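The position-by-position comparison described above can be sketched for two sequences that have already been aligned. The sketch assumes gaps are written as `-`, that a gap never counts as a match, and that percent identity is the number of identical positions divided by the total number of aligned positions; that denominator is an assumption of this example, since conventions differ between programs.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"  # a gap column is never counted as identical
    )
    return 100.0 * matches / len(aligned_a)
```

For the alignment `ACGT-A` against `ACCTTA`, four of six columns are identical, giving approximately 66.7% identity, which can then be compared against thresholds such as the 55% to 99% values recited for the nucleic acid molecules.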

The present invention also provides isolated HDAC9, HDAC9a,
HDAC9(ΔNLS), HDAC9a(ΔNLS), and HDRP(ΔNLS) nucleic acid molecules that
contain a fragment or portion that hybridizes under highly stringent conditions to a
nucleotide sequence comprising a nucleotide sequence selected from SEQ ID NO: 1,
SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, and the complement
of any of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, or SEQ ID
NO: 9 and also provides isolated nucleic acid molecules that contain a fragment or
portion that hybridizes under highly stringent conditions to a nucleotide sequence
encoding an amino acid sequence selected from SEQ ID NO: 2, SEQ ID NO: 4, SEQ
ID NO: 6, SEQ ID NO: 8, and SEQ ID NO: 10. The nucleic acid fragments of the
invention are at least about 15, preferably, at least about 18, 20, 23, or 25
nucleotides, and can be 30, 40, 50, 100, 200 or more nucleotides in length. Longer
fragments, for example, 30 or more nucleotides in length, that encode antigenic
polypeptides described herein are particularly useful, such as for the generation of
antibodies as described above.

In a related aspect, the HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS),
and HDRP(ANLS) nucleic acid fragments of the invention are used as probes or
primers in assays such as those described herein. "Probes" or "primers" are
oligonucleotides that hybridize in a base-specific manner to a complementary strand
of nucleic acid molecules. Such probes and primers include polypeptide nucleic
acids, as described in Nielsen et al., Science, 254, 1497-1500 (1991). As also used
herein, the term "primer" in particular refers to a single-stranded oligonucleotide that
acts as a point of initiation of template-directed DNA synthesis using well-known
methods (e.g., PCR, LCR) including, but not limited to those described herein.

Typically, a probe or primer comprises a region of nucleotide sequence that
hybridizes to at least about 15, typically about 20-25, and more typically about 40,
50 or 75, consecutive nucleotides of a nucleic acid molecule comprising a
contiguous nucleotide sequence selected from: SEQ ID NO: 1, SEQ ID NO: 3, SEQ
ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, the complement of any of SEQ ID NO: 1,
SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, and a sequence
encoding an amino acid sequence of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6,
SEQ ID NO: 8, or SEQ ID NO: 10.

In preferred embodiments, a probe or primer comprises 100 or fewer
nucleotides, preferably, from 6 to 50 nucleotides, and more preferably, from 12 to 30
nucleotides. In other embodiments, the probe or primer is at least 70% identical to
the contiguous nucleotide sequence or to the complement of the contiguous
nucleotide sequence, preferably, at least 80% identical, more preferably, at least 90%
identical, even more preferably, at least 95% identical, or even capable of selectively
hybridizing to the contiguous nucleotide sequence or to the complement of the
contiguous nucleotide sequence. Often, the probe or primer further comprises a
label, e.g., radioisotope, fluorescent compound, enzyme, or enzyme co-factor.

The nucleic acid molecules of the invention such as those described above
can be identified and isolated using standard molecular biology techniques and the
sequence information provided in SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5,
SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6,
SEQ ID NO: 8, and/or SEQ ID NO: 10. For example, nucleic acid molecules can
be amplified and isolated by the polymerase chain reaction using synthetic
oligonucleotide primers designed based on one or more of the nucleic acid
sequences provided above and/or the complement of those sequences. Or such
nucleic acid molecules may be designed based on nucleotide sequences encoding
one or more of the amino acid sequences provided in SEQ ID NO: 2, SEQ ID NO: 4,
SEQ ID NO: 6, SEQ ID NO: 8, or SEQ ID NO: 10. See generally PCR Technology:
Principles and Applications for DNA Amplification (ed. H.A. Erlich, Freeman Press,
NY, NY (1992)); PCR Protocols: A Guide to Methods and Applications (eds. Innis
et al., Academic Press, San Diego, CA (1990)); Mattila et al., Nucleic Acids Res.,
19: 4967 (1991); Eckert et al., PCR Methods and Applications, 1: 17 (1991); PCR
(eds. McPherson et al., IRL Press, Oxford); and U.S. Patent No. 4,683,202. The
nucleic acid molecules can be amplified using cDNA, mRNA, or genomic DNA as a
template, cloned into an appropriate vector and characterized by DNA sequence
analysis.
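
To illustrate how primers can be derived from a sequence and its complement, the following Python sketch takes the forward primer from the 5' end of a template strand and the reverse primer as the reverse complement of its 3' end. The function names and the default primer length are hypothetical, and practical design criteria (melting temperature, GC content, self-complementarity) are deliberately ignored.

```python
# Complement map for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_pair(template, length=20):
    """Pick a naive forward/reverse primer pair flanking the whole template.

    The forward primer is identical to the first `length` bases of the
    template; the reverse primer anneals to the last `length` bases, so it
    is the reverse complement of that region. Both are written 5' to 3'.
    """
    if length <= 0 or 2 * length > len(template):
        raise ValueError("template too short for two non-overlapping primers")
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse
```

Calling `primer_pair("ATGCCCGGGTAA", 3)` would, for example, return the pair `("ATG", "TTA")`.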

Other suitable amplification methods include the ligase chain reaction (LCR)
(See Wu and Wallace, Genomics, 4:560 (1989), Landegren et al., Science, 241:1077
(1988)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA,
86:1173 (1989)), and self-sustained sequence replication (See Guatelli et al., Proc.
Natl. Acad. Sci. USA, 87:1874 (1990)) and nucleic acid based sequence
amplification (NASBA). The latter two amplification methods involve isothermal
reactions based on isothermal transcription, that produce both single stranded RNA
(ssRNA) and double stranded DNA (dsDNA) as the amplification products in a ratio
of about 30 or 100 to 1, respectively.

The amplified DNA can be radiolabeled and used as a probe for screening a
cDNA library derived from human cells, mRNA in ZAP Express, ZIPLOX, or other
suitable vector. Corresponding clones can be isolated, DNA can be obtained
following in vivo excision, and the cloned insert can be sequenced in either or both
orientations by art-recognized methods to identify the correct reading frame
encoding a polypeptide of the appropriate molecular weight. For example, the direct
analysis of the nucleotide sequence of nucleic acid molecules of the present
invention can be accomplished using well-known methods that are commercially
available. See, for example, Sambrook et al., Molecular Cloning, A Laboratory
Manual (2nd Ed., CSHP, New York (1989)); Zyskind et al., Recombinant DNA
Laboratory Manual, (Acad. Press, (1988)). Using these or similar methods, the
polypeptide and the DNA encoding the polypeptide can be isolated, sequenced, and
further characterized.
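
Identifying the reading frame from an insert sequenced in both orientations can be sketched as a six-frame translation followed by a search for the longest open stretch that begins with methionine. The Python below is an illustrative sketch only: it assumes the standard genetic code, the function names are hypothetical, and it does not check the molecular weight of the encoded polypeptide.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate DNA in frame 0; unknown codons become 'X', stops become '*'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def longest_orf(dna):
    """Return (strand, frame, peptide) for the longest Met-initiated stretch.

    All three frames of the given strand ('+') and of its reverse
    complement ('-') are examined; a stretch ends at a stop codon or at
    the end of the sequence.
    """
    dna = dna.upper()
    strands = {"+": dna, "-": dna.translate(_COMPLEMENT)[::-1]}
    best = ("+", 0, "")
    for strand, seq in strands.items():
        for frame in range(3):
            for segment in translate(seq[frame:]).split("*"):
                start = segment.find("M")
                if start != -1 and len(segment) - start > len(best[2]):
                    best = (strand, frame, segment[start:])
    return best
```

In practice the candidate frame would then be checked against the expected size of the polypeptide, as the paragraph above describes.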

Antisense nucleic acid molecules of the invention can be designed using the
nucleotide sequences of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID
NO: 7, SEQ ID NO: 9 and/or the complement of any of SEQ ID NO: 1, SEQ ID NO:
3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9 and/or a portion of those
sequences, and/or the complement of those portions or sequences, and/or a sequence
encoding the amino acid sequence of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6,
SEQ ID NO: 8, SEQ ID NO: 10, or encoding a portion of SEQ ID NO: 2, SEQ ID
NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or SEQ ID NO: 10. Such antisense nucleic
acid molecules can be constructed using chemical synthesis and enzymatic ligation
reactions using procedures known in the art. For example, an antisense nucleic acid
molecule (e.g., an antisense oligonucleotide) can be chemically synthesized using
naturally occurring nucleotides or variously modified nucleotides designed to
increase the biological stability of the molecules or to increase the physical stability
of the duplex formed between the antisense and sense nucleic acids, e.g.,
phosphorothioate derivatives and acridine substituted nucleotides can be used.
Alternatively, the antisense nucleic acid molecule can be produced biologically using
an expression vector into which a nucleic acid molecule has been subcloned in an
antisense orientation (i.e., RNA transcribed from the inserted nucleic acid molecule
will be of an antisense orientation to a target nucleic acid of interest).

In general, the isolated HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS),
and HDRP(ANLS) nucleic acid sequences of the invention can be used as molecular
weight markers on Southern blots, and as chromosome markers that are labeled to
map related gene positions. The nucleic acid sequences can also be used to compare
with endogenous DNA sequences in patients to identify genetic disorders (e.g., a
predisposition for or susceptibility to a cell proliferation disease, an apoptotic
disease, or a cell differentiation disease), and as probes, such as to hybridize and
discover related DNA sequences or to subtract out known sequences from a sample.
The nucleic acid molecules of the present invention can also be used as therapeutic
agents.

By a "cell proliferation disease" is meant a disease that is caused by or results
in undesirably high levels of cell division, undesirably low levels of apoptosis, or
both. For example, cancers such as lymphoma, leukemia, melanoma, ovarian
cancer, breast cancer, pancreatic cancer, prostate cancer, colon cancer, and lung
cancer are all examples of cell proliferation diseases. Myeloproliferative disorders,
including polycythemia vera, essential thrombocythemia, agnogenic myeloid
metaplasia, and chronic myelogenous leukemia are also cell proliferation diseases.

By a "cell differentiation disease" is meant a disease that is caused by or
results in undesirably low levels of cell differentiation, or by undesirably high levels
of cell differentiation. For example, cancers such as lymphoma, leukemia,
melanoma, ovarian cancer, breast cancer, pancreatic cancer, prostate cancer, colon
cancer, and lung cancer are all examples of cell differentiation diseases.
Myeloproliferative disorders, including polycythemia vera, essential
thrombocythemia, agnogenic myeloid metaplasia, and chronic myelogenous
leukemia are also cell differentiation diseases.

By an "apoptotic disease" is meant a condition in which the apoptotic
response is abnormal. This may pertain to a cell or a population of cells that does
not undergo cell death under appropriate conditions. For example, normally a cell
will die upon exposure to apoptotic-triggering agents, such as chemotherapeutic
agents, or ionizing radiation. When, however, a subject has an apoptotic disease, for
example, cancer, the cell or a population of cells may not undergo cell death in
response to contact with apoptotic-triggering agents. In addition, a subject may have
an apoptotic disease when the occurrence of cell death is too low, for example, when
the number of proliferating cells exceeds the number of cells undergoing cell death,
as occurs in cancer when such cells do not properly differentiate.

An apoptotic disease may also be a condition characterized by the occurrence
of undesirably high levels of apoptosis. For example, certain neurodegenerative
diseases, including but not limited to Alzheimer's disease, Parkinson's disease, 
amyotrophic lateral sclerosis, multiple sclerosis, restenosis, stroke, and ischemic
brain injury are apoptotic diseases in which neuronal cells undergo undesired cell
death. 

Other diseases for which the polypeptides and nucleic acid molecules of the
present invention may be useful for diagnosing and/or treating include, but are not
limited to, Huntington's disease.

The HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), and
HDRP(ANLS) nucleic acid molecules of the present invention can further be used to
derive primers for genetic fingerprinting, to raise anti-polypeptide antibodies using
DNA immunization techniques, and as an antigen to raise anti-DNA antibodies or
elicit immune responses. Portions or fragments of the nucleotide sequences
identified herein (and the corresponding complete gene sequences) can be used in
numerous ways as polynucleotide reagents. For example, these sequences can be
used to: (i) map their respective genes on a chromosome and, thus, locate gene
regions associated with genetic disease; (ii) identify an individual from a minute
biological sample (tissue typing); and (iii) aid in forensic identification of a
biological sample.

In addition, the HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), and
HDRP(ANLS) nucleotide sequences of the invention can be used to identify and
express recombinant polypeptides for analysis, characterization, or therapeutic use,
or as markers for tissues in which the corresponding polypeptide is expressed, either
constitutively, during tissue differentiation, or in diseased states. The nucleic acid
sequences can additionally be used as reagents in the screening and/or diagnostic
assays described herein, and can also be included as components of kits (e.g.,
reagent kits) for use in the screening and/or diagnostic assays described herein.

Standard techniques, such as the polymerase chain reaction (PCR) and DNA
hybridization, may be used to clone HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) homologs in other species, for example,
mammalian homologs. HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) homologs may be readily identified using low-stringency DNA
hybridization or low-stringency PCR with human HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) probes or primers. Degenerate
primers encoding human HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptides may be used to clone HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) homologs by RT-PCR.
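
When designing degenerate primers from a polypeptide sequence, one practical consideration is how many distinct nucleotide sequences encode a given stretch of amino acids. The Python sketch below, which assumes the standard genetic code and uses hypothetical function names, counts that degeneracy and finds the least degenerate window of a chosen width. It is an illustration of the idea rather than a description of any procedure used in the invention.

```python
from collections import Counter

# Amino acids of the standard genetic code, one letter per codon;
# counting letters gives the number of codons for each amino acid.
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_PER_AA = Counter(_AMINO.replace("*", ""))


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that encode the peptide."""
    total = 1
    for aa in peptide.upper():
        if aa not in CODONS_PER_AA:
            raise ValueError(f"not a standard amino acid: {aa!r}")
        total *= CODONS_PER_AA[aa]
    return total


def least_degenerate_window(protein, width):
    """Return (start, window, degeneracy) for the least degenerate window.

    Ties are resolved in favor of the window closest to the N-terminus.
    """
    if width <= 0 or width > len(protein):
        raise ValueError("width must be between 1 and the protein length")
    best = None
    for start in range(len(protein) - width + 1):
        window = protein[start:start + width]
        score = degeneracy(window)
        if best is None or score < best[2]:
            best = (start, window, score)
    return best
```

Windows rich in methionine or tryptophan (one codon each) score lowest, which is why such regions are commonly favored when placing degenerate primers.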

Alternatively, additional HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) homologs can be identified by utilizing
consensus sequence information for HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) polypeptides to search for similar polypeptides
in other species. For example, polypeptide databases for other species can be
searched for proteins with the HDAC domains described herein. Candidate
polypeptides containing such a motif can then be tested for their HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) biological activities, using
methods described herein.
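
A motif-based database search of this kind can be sketched as a regular-expression scan over a collection of protein sequences. In the Python below, the function name, the in-memory dictionary standing in for a polypeptide database, and any pattern passed to it are hypothetical; in particular, no pattern used with this sketch should be taken as an actual HDAC domain signature.

```python
import re


def find_motif_candidates(proteins, motif_regex):
    """Return {name: [start positions]} for sequences containing the motif.

    proteins maps a record name to its amino acid sequence (one-letter
    code). motif_regex is a regular expression over those letters.
    Only non-overlapping matches are reported for each sequence.
    """
    pattern = re.compile(motif_regex)
    hits = {}
    for name, sequence in proteins.items():
        positions = [m.start() for m in pattern.finditer(sequence.upper())]
        if positions:
            hits[name] = positions
    return hits
```

Sequences returned by such a scan are only candidates; as stated above, they would still need to be tested for biological activity.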

EXPRESSION OF THE NUCLEIC ACID MOLECULES OF THE INVENTION 

Another aspect of the invention pertains to nucleic acid constructs containing
an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) nucleic
acid molecule, for example, one selected from the group consisting of SEQ ID NO:
1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, and the
complement of any of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO:
7, or SEQ ID NO: 9 (or portions thereof). Yet another aspect of the invention
pertains to HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), and HDRP(ANLS)
nucleic acid constructs containing a nucleic acid molecule encoding the amino acid
sequence of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or SEQ
ID NO: 10. The constructs comprise a vector (e.g., an expression vector) into which
a sequence of the invention has been inserted in a sense or antisense orientation.

As used herein, the term "vector" or "construct" refers to a nucleic acid
molecule capable of transporting another nucleic acid to which it has been linked.
One type of vector is a "plasmid," which refers to a circular double stranded DNA
loop into which additional DNA segments can be ligated. Another type of vector is
a viral vector, wherein additional DNA segments can be ligated into the viral
genome. Certain vectors are capable of autonomous replication in a host cell into
which they are introduced (e.g., bacterial vectors having a bacterial origin of
replication and episomal mammalian vectors). Other vectors (e.g., non-episomal
mammalian vectors) are integrated into the genome of a host cell upon introduction
into the host cell, and thereby are replicated along with the host genome. Moreover,
certain vectors, expression vectors, are capable of directing the expression of genes
to which they are operably linked. In general, expression vectors of utility in
recombinant DNA techniques are often in the form of plasmids. However, the
invention is intended to include such other forms of expression vectors, such as viral
vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated
viruses) that serve equivalent functions.

Preferred recombinant expression vectors of the invention comprise a nucleic
acid molecule of the invention in a form suitable for expression of the nucleic acid
molecule in a host cell. This means that the recombinant expression vectors include
one or more regulatory sequences, selected on the basis of the host cells to be used
for expression, which are operably linked to the nucleic acid sequence to be
expressed. Within a recombinant expression vector, "operably linked" is intended to
mean that the nucleotide sequence of interest is linked to the regulatory sequence(s)
in a manner that allows for expression of the nucleotide sequence (e.g., in an in vitro
transcription/translation system or in a host cell when the vector is introduced into
the host cell). The term "regulatory sequence" is intended to include promoters,
enhancers and other expression control elements (e.g., polyadenylation signals).
Such regulatory sequences are described, for example, in Goeddel, Gene Expression
Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990).
Regulatory sequences include those that direct constitutive expression of a
nucleotide sequence in many types of host cell and those that direct expression of the
nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory
sequences).

It will be appreciated by those skilled in the art that the design of the
expression vector can depend on such factors as the choice of the host cell to be
transformed and the level of expression of polypeptide desired. The expression
vectors of the invention can be introduced into host cells to thereby produce
polypeptides, including fusion polypeptides, encoded by nucleic acid molecules as
described herein.

The recombinant expression vectors of the invention can be designed for
expression of a polypeptide of the invention in prokaryotic or eukaryotic cells, e.g.,
bacterial cells, such as E. coli, insect cells (using baculovirus expression vectors),
yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel,
supra. Alternatively, the recombinant expression vector can be transcribed and
translated in vitro, for example, using T7 promoter regulatory sequences and T7
polymerase.

Another aspect of the invention pertains to host cells into which a
recombinant expression vector of the invention has been introduced. The terms
"host cell" and "recombinant host cell" are used interchangeably herein. It is
understood that such terms refer not only to the particular subject cell but also to the
progeny or potential progeny of such a cell. Because certain modifications may
occur in succeeding generations due to either mutation or environmental influences,
such progeny may not, in fact, be identical to the parent cell, but are still included
within the scope of the term as used herein.

A host cell can be any prokaryotic or eukaryotic cell. For example, a nucleic
acid molecule of the invention can be expressed in bacterial cells (e.g., E. coli),
insect cells, yeast, or mammalian cells (such as Chinese hamster ovary cells (CHO)
or COS cells, human 293T cells, HeLa cells, NIH 3T3 cells, and mouse
erythroleukemia (MEL) cells). Other suitable host cells are known to those skilled
in the art.

Vector DNA can be introduced into prokaryotic or eukaryotic cells via
conventional transformation or transfection techniques. As used herein, the terms
"transformation" and "transfection" are intended to refer to a variety of
art-recognized techniques for introducing a foreign nucleic acid molecule (e.g.,
DNA) into a host cell, including calcium phosphate or calcium chloride
co-precipitation, DEAE-dextran-mediated transfection, lipofection, or
electroporation. Suitable methods for transforming or transfecting host cells can be
found in Sambrook et al. (supra), and other laboratory manuals.

For stable transfection of mammalian cells, it is known that, depending upon
the expression vector and transfection technique used, only a small fraction of cells
may integrate the foreign DNA into their genome. In order to identify and select
these integrants, a gene that encodes a selectable marker (e.g., for resistance to
antibiotics) is generally introduced into the host cells along with the gene of interest.
Preferred selectable markers include those that confer resistance to drugs, such as
G418, hygromycin, or methotrexate. Nucleic acid molecules encoding a selectable
marker can be introduced into a host cell on the same vector as the nucleic acid
molecule of the invention or can be introduced on a separate vector. Cells stably
transfected with the introduced nucleic acid molecule can be identified by drug
selection (e.g., cells that have incorporated the selectable marker gene will survive,
while the other cells die).

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in
culture, can be used to produce (i.e., express) a polypeptide of the invention.
Accordingly, the invention further provides methods for producing a polypeptide
using the host cells of the invention. In one embodiment, the method comprises
culturing the host cell of the invention (into which a recombinant expression vector
encoding a polypeptide of the invention has been introduced) in a suitable medium
such that the polypeptide is produced. In another embodiment, the method further
comprises isolating the polypeptide from the medium or the host cell.

The host cells of the invention can also be used to produce non-human
transgenic animals. For example, in one embodiment, a host cell of the invention is
a fertilized oocyte or an embryonic stem cell into which an HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) nucleic acid molecule of the
invention has been introduced. Such host cells can then be used to create
non-human transgenic animals in which exogenous nucleotide sequences have been
introduced into the genome or homologous recombinant animals in which
endogenous nucleotide sequences have been altered. Such animals are useful for
studying the function and/or activity of the nucleotide sequence and polypeptide
encoded by the sequence and for identifying and/or evaluating modulators of their
activity.

As used herein, a "transgenic animal" is a non-human animal, preferably, a
mammal, more preferably, a rodent such as a rat or mouse, in which one or more of
the cells of the animal includes a transgene. Other examples of transgenic animals
include non-human primates, sheep, dogs, cows, goats, chickens, and amphibians. A
transgene is exogenous DNA that is integrated into the genome of a cell from which
a transgenic animal develops and that remains in the genome of the mature animal,
thereby directing the expression of an encoded gene product in one or more cell
types or tissues of the transgenic animal. As used herein, a "homologous
recombinant animal" is a non-human animal, preferably, a mammal, more
preferably, a mouse, in which an endogenous gene has been altered by homologous
recombination between the endogenous gene and an exogenous DNA molecule
introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to
development of the animal.

Methods for generating transgenic animals via embryo manipulation and
microinjection, particularly animals such as mice, have become conventional in the
art and are described, for example, in U.S. Patent Nos. 4,736,866 and 4,870,009,
U.S. Patent No. 4,873,191, and in Hogan, Manipulating the Mouse Embryo (Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1986)). Methods for
constructing homologous recombination vectors and homologous recombinant
animals are described further in Bradley, Current Opinion in Bio/Technology,
2:823-829 (1991) and in PCT Publication Nos. WO 90/11354, WO 91/01140, WO
92/0968, and WO 93/04169. Clones of the non-human transgenic animals described
herein can also be produced according to the methods described in Wilmut et al.,
Nature, 385:810-813 (1997) and PCT Publication Nos. WO 97/07668 and WO
97/07669.

ANTIBODIES OF THE INVENTION 

Polyclonal and/or monoclonal antibodies that selectively bind one form of an
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide but not another form of the polypeptide are also provided. Antibodies
are also provided that bind a portion of either the variant or reference HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide that
contains the polymorphic site or sites.

In another aspect, the invention provides antibodies to each of the HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), and HDRP(ANLS) polypeptides and
polypeptide fragments of the invention, e.g., having an amino acid sequence encoded
by SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10,
or a portion thereof, or having an amino acid sequence encoded by a nucleic acid
molecule comprising all or a portion of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO:
5, SEQ ID NO: 7, or SEQ ID NO: 9 (e.g., SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID
NO: 6, SEQ ID NO: 8, or SEQ ID NO: 10, or another variant, or portion thereof).

The term "purified antibody" as used herein refers to immunoglobulin
molecules and immunologically active portions of immunoglobulin molecules, i.e.,
molecules that contain an antigen binding site that selectively binds an antigen. A
molecule that selectively binds to a polypeptide of the invention is a molecule that
binds to that polypeptide or a fragment thereof, but does not substantially bind other
molecules in a sample, e.g., a biological sample that naturally contains the
polypeptide. Preferably the antibody is at least 60%, by weight, free from proteins
and naturally occurring organic molecules with which it is naturally associated. More
preferably, the antibody preparation is at least 75% or 90%, and most preferably,
99%, by weight, antibody. Examples of immunologically active portions of
immunoglobulin molecules include F(ab) and F(ab')2 fragments that can be
generated by treating the antibody with an enzyme such as pepsin.

The invention provides polyclonal and monoclonal antibodies that selectively
bind to an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide of the invention. The term "monoclonal antibody" or "monoclonal
antibody composition," as used herein, refers to a population of antibody molecules
that contain only one species of an antigen binding site capable of immunoreacting
with a particular epitope of a polypeptide of the invention. A monoclonal antibody
composition thus typically displays a single binding affinity for a particular
polypeptide of the invention with which it immunoreacts.

Polyclonal antibodies can be prepared as described above by immunizing a
suitable subject with a desired immunogen, e.g., an HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide of the invention or
fragment thereof. The antibody titer in the immunized subject can be monitored
over time by standard techniques, such as with an enzyme linked immunosorbent
assay (ELISA) using immobilized polypeptide. If desired, the antibody molecules
directed against the polypeptide can be isolated from the mammal (e.g., from the
blood) and further purified by well-known techniques, such as protein A
chromatography to obtain the IgG fraction.

At an appropriate time after immunization, e.g., when the antibody titers are
highest, antibody-producing cells can be obtained from the subject and used to
prepare monoclonal antibodies by standard techniques, such as the hybridoma
technique originally described by Kohler and Milstein, Nature, 256:495-497 (1975),
the human B cell hybridoma technique (Kozbor et al., Immunol. Today, 4:72
(1983)), the EBV-hybridoma technique (Cole et al., Monoclonal Antibodies and
Cancer Therapy, Alan R. Liss, Inc., pp. 77-96 (1985)) or trioma techniques. The
technology for producing hybridomas is well known (see generally Current Protocols
in Immunology, Coligan et al. (eds.), John Wiley & Sons, Inc., New York, NY
(1994)). Briefly, an immortal cell line (typically a myeloma) is fused to lymphocytes
(typically splenocytes) from a mammal immunized with an immunogen as described
above, and the culture supernatants of the resulting hybridoma cells are screened to
identify a hybridoma producing a monoclonal antibody that binds a polypeptide of
the invention.

Any of the many well known protocols used for fusing lymphocytes and
immortalized cell lines can be applied for the purpose of generating a monoclonal
antibody to a polypeptide of the invention (see, e.g., Current Protocols in
Immunology, supra; Galfre et al., (1977) Nature, 266:550-552; R.H. Kenneth, in
Monoclonal Antibodies: A New Dimension In Biological Analyses, Plenum
Publishing Corp., New York, New York (1980); and Lerner, Yale J. Biol. Med.,
54:387-402 (1981)). Moreover, the ordinarily skilled worker will appreciate that
there are many variations of such methods that also would be useful.

Alternative to preparing monoclonal antibody-secreting hybridomas, a
monoclonal antibody to an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS),
or HDRP(ANLS) polypeptide of the invention can be identified and isolated by
screening a recombinant combinatorial immunoglobulin library (e.g., an antibody
phage display library) with the polypeptide to thereby isolate immunoglobulin
library members that bind the polypeptide. Kits for generating and screening phage
display libraries are commercially available (e.g., the Pharmacia Recombinant Phage
Antibody System, Catalog No. 27-9400-01; and the Stratagene SurfZAP™ Phage
Display Kit, Catalog No. 240612). Additionally, examples of methods and reagents
particularly amenable for use in generating and screening antibody display library
can be found in, for example, U.S. Patent No. 5,223,409; PCT Publication No. WO
92/18619; PCT Publication No. WO 91/17271; PCT Publication No. WO 92/20791;
PCT Publication No. WO 92/15679; PCT Publication No. WO 93/01288; PCT
Publication No. WO 92/01047; PCT Publication No. WO 92/09690; PCT
Publication No. WO 90/02809; Fuchs et al., Bio/Technology, 9:1370-1372 (1991);
Hay et al., Hum. Antibod. Hybridomas, 3:81-85 (1992); Huse et al., Science,
246:1275-1281 (1989); and Griffiths et al., EMBO J., 12:725-734 (1993).

Additionally, recombinant antibodies, such as chimeric and humanized
monoclonal antibodies, comprising both human and non-human portions, which can
be made using standard recombinant DNA techniques, are within the scope of the
invention. Such chimeric and humanized monoclonal antibodies can be produced by
recombinant DNA techniques known in the art.

15 In general, antibodies of the invention (e.g., a monoclonal antibody) can be 

used to isolate an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or 
HDRP(ANLS) polypeptide of the invention by standard techniques, such as affinity 
chromatography or immunoprecipitation. A polypeptide-specific antibody can 
facilitate the purification of natural polypeptide from cells and of recombinantly
produced polypeptide expressed in host cells. Moreover, an antibody specific for an
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) 
polypeptide of the invention can be used to detect the polypeptide (e.g., in a cellular 
lysate, cell supernatant, or tissue sample) in order to evaluate the abundance and 
pattern of expression of the polypeptide. 

The antibodies of the present invention can also be used diagnostically to
monitor protein levels in tissue as part of a clinical testing procedure, e.g., to
determine the efficacy of a given treatment regimen. Detection can be
facilitated by coupling the antibody to a detectable substance. Examples of
detectable substances include various enzymes, prosthetic groups, fluorescent
materials, luminescent materials, bioluminescent materials, and radioactive
materials. Examples of suitable enzymes include horseradish peroxidase, alkaline
phosphatase, β-galactosidase, and acetylcholinesterase; examples of suitable
prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples
of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein
isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride, and
phycoerythrin; an example of a luminescent material is luminol; examples of
bioluminescent materials include luciferase, luciferin, and aequorin; and examples of
suitable radioactive materials include ¹²⁵I, ¹³¹I, ³⁵S, and ³H.

DIAGNOSTIC AND SCREENING ASSAYS OF THE INVENTION 

The present invention also pertains to diagnostic assays for assessing HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) gene expression, or
for assessing activity of HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptides of the invention. In one embodiment, the assays are
used in the context of a biological sample (e.g., blood, serum, cells, tissue) to
thereby determine whether an individual is afflicted with a cell proliferation disease,
an apoptotic disease, or a cell differentiation disease, or is at risk for (has a
predisposition for or a susceptibility to) developing a cell proliferation disease, an
apoptotic disease, or a cell differentiation disease. The invention also provides for
prognostic (or predictive) assays for determining whether an individual is
susceptible to developing a cell proliferation disease, an apoptotic disease, or a cell
differentiation disease. For example, mutations in the HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) nucleic acid molecule can be
assayed in a biological sample. Such assays can be used for prognostic or predictive
purposes to thereby prophylactically treat an individual prior to the onset of
symptoms associated with a cell proliferation disease, an apoptotic disease, or a cell
differentiation disease.

Another aspect of the invention pertains to assays for monitoring the
influence of agents or candidate compounds (e.g., drugs or other agents) on the
nucleic acid molecule expression or biological activity of polypeptides of the
invention, as well as to assays for identifying candidate compounds that bind to an
HDAC9 polypeptide, an HDAC9a polypeptide, an HDAC9(ANLS) polypeptide, an
HDAC9a(ANLS) polypeptide, or an HDRP(ANLS) polypeptide. These and other
assays and agents are described in further detail in the following sections.

DIAGNOSTIC ASSAYS

HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
nucleic acid molecules, probes, primers, polypeptides, and antibodies to an HDAC9
protein, an HDAC9a protein, an HDAC9(ANLS) protein, an HDAC9a(ANLS) protein,
or an HDRP(ANLS) protein can be used in methods of diagnosis of a susceptibility to,
or likelihood of having, a cell proliferation disease, an apoptotic disease, or a cell
differentiation disease, as well as in kits useful for diagnosis of a susceptibility to a
cell proliferation disease, an apoptotic disease, or a cell differentiation disease.

In one embodiment of the invention, diagnosis of a decreased susceptibility
to a cell proliferation disease, an apoptotic disease, or a cell differentiation disease is
made by detecting a polymorphism in HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS). The polymorphism can be a mutation in
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS), such as the
insertion or deletion of a single nucleotide, or of more than one nucleotide, resulting
in a frame shift mutation; the change of at least one nucleotide, resulting in a change
in the encoded amino acid; the change of at least one nucleotide, resulting in the
generation of a premature stop codon; the deletion of several nucleotides, resulting
in a deletion of one or more amino acids encoded by the nucleotides; the insertion of
one or several nucleotides, such as by unequal recombination or gene conversion,
resulting in an interruption of the coding sequence of the gene; duplication of all or a
part of the gene; transposition of all or a part of the gene; rearrangement of all or a
part of the gene; or a change in the expression pattern of the various HDAC9
isoforms. More than one such mutation may be present in a single nucleic acid
molecule.

Such sequence changes cause a mutation in the polypeptide encoded by
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS). For
example, if the mutation is a frame shift mutation, the frame shift can result in a
change in the encoded amino acids, and/or can result in the generation of a
premature stop codon, causing generation of a truncated polypeptide. Alternatively,
a polymorphism associated with a decreased susceptibility to a cell proliferation
disease, an apoptotic disease, or a cell differentiation disease can be a synonymous
mutation in one or more nucleotides (i.e., a mutation that does not result in a change
in the HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide). Such a polymorphism may alter splicing sites, affect the stability or
transport of mRNA, or otherwise affect the transcription or translation of the nucleic
acid molecule. An HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) nucleic acid molecule that has any of the mutations described above is
referred to herein as a "mutant nucleic acid molecule."
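
By way of illustration only, the mutation classes described above (frame shift,
change in the encoded amino acid, premature stop codon, and synonymous change)
can be distinguished computationally by translating a reference coding sequence
and a variant coding sequence with the standard genetic code. The following
Python sketch assumes that both inputs are coding sequences beginning at the first
codon; the function names and the toy sequences are hypothetical and are not the
sequences of the invention.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def classify_mutation(reference, variant):
    """Assign a variant coding sequence to one of the mutation classes."""
    if (len(variant) - len(reference)) % 3 != 0:
        return "frameshift"
    ref_protein, var_protein = translate(reference), translate(variant)
    if ref_protein == var_protein:
        return "synonymous"
    if len(variant) != len(reference):
        return "in-frame insertion/deletion"
    if len(var_protein) < len(ref_protein) and ref_protein.startswith(var_protein):
        return "nonsense"  # premature stop codon, truncated polypeptide
    return "missense"


# Hypothetical 15-nucleotide reference (Met-Gly-Ile-Ala-Tyr) and a variant
# in which the final codon TAT is changed to the stop codon TAA.
print(classify_mutation("ATGGGAATTGCCTAT", "ATGGGAATTGCCTAA"))
```

A sketch of this kind only classifies the coding consequence; it does not address
effects on splicing, mRNA stability, or expression pattern, which the text above
also contemplates.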

In a first method of diagnosing a decreased susceptibility to a cell
proliferation disease, an apoptotic disease, or a cell differentiation disease,
hybridization methods, such as Southern analysis, Northern analysis, or in situ
hybridizations, can be used (see Ausubel et al., supra). For example, a biological
sample from a test subject (a "test sample") of genomic DNA, RNA, or cDNA is
obtained from an individual suspected of having, being susceptible to or predisposed
for, or carrying a defect for, a cell proliferation disease, an apoptotic disease, or a
cell differentiation disease (the "test individual"). The individual can be an adult,
child, or fetus. The test sample can be from any source that contains genomic DNA,
such as a blood sample, sample of amniotic fluid, sample of cerebrospinal fluid, or
tissue sample from skin, muscle, buccal or conjunctival mucosa, placenta,
gastrointestinal tract, or other organs. A test sample of DNA from fetal cells or
tissue can be obtained by appropriate methods, such as by amniocentesis or
chorionic villus sampling. The DNA, RNA, or cDNA sample is then examined to
determine whether a polymorphism in HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) is present, and/or to determine which variant(s)
encoded by HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
is present. The presence of the polymorphism or variant(s) can be indicated by
hybridization of the gene in the genomic DNA, RNA, or cDNA to a nucleic acid
probe. A "nucleic acid probe," as used herein, can be a DNA probe or an RNA
probe; the nucleic acid probe can contain at least one polymorphism in HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS), or can contain a
nucleic acid encoding a particular variant of HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS). The probe can be any of the nucleic acid
molecules described above (e.g., the entire nucleic acid molecule, a fragment, a
vector comprising the gene, a probe, or primer, etc.).

To diagnose a decreased susceptibility to a cell proliferation disease, an
apoptotic disease, or a cell differentiation disease, a hybridization sample is formed
by contacting the test sample containing HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) with at least one nucleic acid probe. A preferred
probe for detecting mRNA or genomic DNA is a labeled nucleic acid probe capable
of hybridizing to HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) mRNA or genomic DNA sequences described herein. The nucleic
acid probe can be, for example, a full-length nucleic acid molecule, or a portion
thereof, such as an oligonucleotide of at least 15, 30, 50, 100, 250, or 500
nucleotides in length and sufficient to specifically hybridize under stringent
conditions to appropriate mRNA or genomic DNA. For example, the nucleic acid
probe can be all or a portion of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ
ID NO: 7, or SEQ ID NO: 9, or the complement of SEQ ID NO: 1, SEQ ID NO: 3,
SEQ ID NO: 5, SEQ ID NO: 7, or SEQ ID NO: 9; or can be a nucleic acid molecule
encoding all or a portion of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID
NO: 8, or SEQ ID NO: 10. Other suitable probes for use in the diagnostic assays of
the invention are described above (see, e.g., probes and primers discussed under the
heading "Nucleic Acids of the Invention").

The hybridization sample is maintained under conditions that are sufficient to 
allow specific hybridization of the nucleic acid probe to HDAC9, HDAC9a, 
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS). "Specific hybridization," as 
used herein, indicates exact hybridization (e.g., with no mismatches). Specific
hybridization can be performed under high stringency conditions or moderate
stringency conditions, for example, as described above. In a particularly preferred
embodiment, the hybridization conditions for specific hybridization are high 
stringency. 
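
As a purely computational analogy to "specific hybridization" as defined above
(exact pairing with no mismatches), the following Python sketch locates the
positions on a single-stranded target at which a probe is perfectly complementary.
It is a minimal sketch under stated assumptions: sequences are written 5' to 3'
using only A, C, G, and T, the function names and toy sequences are hypothetical,
and no attempt is made to model stringency, temperature, or salt conditions.

```python
# Watson-Crick complement map for DNA written with A, C, G, T.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def exact_hybridization_sites(probe, target):
    """Return 0-based start positions on `target` where `probe` pairs
    along its full length with no mismatches."""
    site = reverse_complement(probe)
    target = target.upper()
    return [
        start
        for start in range(len(target) - len(site) + 1)
        if target[start:start + len(site)] == site
    ]


# Hypothetical 25-nucleotide target and a 15-nucleotide probe designed
# against positions 6 to 20 of that target.
target = "GGAATTGCCTATGACCCCTTGATGC"
probe = reverse_complement("GCCTATGACCCCTTG")
print(exact_hybridization_sites(probe, target))
```

A probe carrying a single mismatch returns an empty list in this sketch, which
mirrors the "no mismatches" definition but is stricter than what is observed
experimentally under moderate stringency.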

Specific hybridization, if present, is then detected using standard methods. If
specific hybridization occurs between the nucleic acid probe and HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) in the test sample, then HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) has the
polymorphism, or is the variant, that is present in the nucleic acid probe. More than
one nucleic acid probe can also be used concurrently in this method. Specific
hybridization of any one of the nucleic acid probes is indicative of a polymorphism
in HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS), or of the
presence of a particular variant encoded by HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS), and is therefore diagnostic for a decreased
susceptibility to a cell proliferation disease, an apoptotic disease, or a cell
differentiation disease.

In Northern analysis (see Current Protocols in Molecular Biology, Ausubel
et al., supra), the hybridization methods described above are used to identify the
presence of a polymorphism or of a particular variant associated with a decreased
susceptibility to a cell proliferation disease, an apoptotic disease, or a cell
differentiation disease. For Northern analysis, a test sample of RNA is obtained
from the individual by appropriate means. Specific hybridization of a nucleic acid
probe, as described above, to RNA from the individual is indicative of a
polymorphism in HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS), or of the presence of a particular variant encoded by HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS), and is therefore
diagnostic for a decreased susceptibility to a cell proliferation disease, an apoptotic
disease, or a cell differentiation disease.

For representative examples of use of nucleic acid probes, see, for example, 
U.S. Patent Nos. 5,288,611 and 4,851,330. 

Alternatively, a peptide nucleic acid (PNA) probe can be used instead of a
nucleic acid probe in the hybridization methods described above. PNA is a DNA
mimic having a peptide-like, inorganic backbone, such as N-(2-aminoethyl)glycine
units, with an organic base (A, G, C, T, or U) attached to the glycine nitrogen via a
methylene carbonyl linker (see, for example, Nielsen et al., Bioconjugate Chemistry,
5 (1994), American Chemical Society, p. 1 (1994)). The PNA probe can be
designed to specifically hybridize to a gene having a polymorphism associated with
a susceptibility to a cell proliferation disease, an apoptotic disease, or a cell
differentiation disease. Hybridization of the PNA probe to HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) is diagnostic for a decreased
susceptibility to a cell proliferation disease, an apoptotic disease, or a cell
differentiation disease.

In another method of the invention, mutation analysis by restriction digestion
can be used to detect a mutant nucleic acid molecule, or nucleic acid molecules
containing a polymorphism(s), if the mutation or polymorphism in the gene results
in the creation or elimination of a restriction site. A test sample containing genomic
DNA is obtained from the individual. Polymerase chain reaction (PCR) can be used
to amplify HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
(and, if necessary, the flanking sequences) in the test sample of genomic DNA from
the test individual. RFLP analysis is conducted as described (see Current Protocols
in Molecular Biology, supra). The digestion pattern of the relevant DNA fragment
indicates the presence or absence of the mutation or polymorphism in HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS), and therefore
indicates the presence or absence of this decreased susceptibility to a cell
proliferation disease, an apoptotic disease, or a cell differentiation disease.
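
The relationship between a polymorphism and the resulting digestion pattern can
be illustrated in silico. The following Python sketch computes the fragment
lengths expected from complete digestion of a linear amplicon and shows how
elimination of a restriction site changes the pattern. It assumes a single
enzyme, EcoRI, which recognizes GAATTC and cuts after the first G; the function
name and the 46-nucleotide amplicons are hypothetical and serve only as an
example.

```python
def restriction_fragments(seq, site, cut_offset):
    """Return fragment lengths after complete digestion of a linear sequence.

    `site` is the recognition sequence and `cut_offset` is the number of
    bases from the start of the site to the cut position on this strand.
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical amplicons: the variant differs by one base (A to G) that
# eliminates the EcoRI site present in the reference.
reference = "ACGT" * 5 + "GAATTC" + "TTGCA" * 4
variant = "ACGT" * 5 + "GAGTTC" + "TTGCA" * 4

print(restriction_fragments(reference, "GAATTC", 1))  # two fragments
print(restriction_fragments(variant, "GAATTC", 1))    # one uncut fragment
```

In this sketch the reference amplicon yields two fragments and the variant
remains uncut, which corresponds to the difference in banding pattern that RFLP
analysis is intended to reveal.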

Sequence analysis can also be used to detect specific polymorphisms in
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS). A test
sample of DNA or RNA is obtained from the test individual. PCR or other
appropriate methods can be used to amplify the nucleic acid molecule, and/or its
flanking sequences, if desired. The sequence of HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS), or a fragment of any of
those nucleic acid molecules, or an HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) cDNA, or a fragment of any of those cDNAs, or
an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) mRNA,
or a fragment of any of those mRNAs, is determined using standard methods. The
sequence of the above gene, gene fragment, cDNA, cDNA fragment, mRNA, or
mRNA fragment is compared with the known nucleic acid sequence of the nucleic
acid molecule, cDNA (e.g., SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID
NO: 7, SEQ ID NO: 9, or a nucleic acid sequence encoding the protein of SEQ ID
NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, or a fragment
thereof), or mRNA, as appropriate. The presence of a polymorphism in HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) indicates that the
individual has a decreased susceptibility to a cell proliferation disease, an apoptotic
disease, or a cell differentiation disease.
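
The comparison step described above can be sketched as a position-by-position
check of a determined sequence against a known reference sequence. The Python
example below is a minimal sketch that assumes the two sequences have already
been aligned and are of equal length, so that only substitutions are reported;
insertions and deletions would require an alignment step that is not shown. The
function name and the toy sequences are hypothetical.

```python
def find_substitutions(reference, test):
    """List (1-based position, reference base, test base) for each mismatch
    between two pre-aligned sequences of equal length."""
    if len(reference) != len(test):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        (index + 1, ref_base, test_base)
        for index, (ref_base, test_base) in enumerate(
            zip(reference.upper(), test.upper())
        )
        if ref_base != test_base
    ]


# Hypothetical reference and test reads differing at one position.
print(find_substitutions("GGAATTGCCTATGACC", "GGAATTGCCTGTGACC"))
```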

Allele-specific oligonucleotides can also be used to detect the presence of a
polymorphism in HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS), through the use of dot-blot hybridization of amplified
oligonucleotides with allele-specific oligonucleotide (ASO) probes (see, for
example, Saiki et al., Nature (London) 324:163-166 (1986)). An "allele-specific
oligonucleotide" (also referred to herein as an "allele-specific oligonucleotide
probe") is an oligonucleotide of approximately 10-50 base pairs, preferably
approximately 15-30 base pairs, that specifically hybridizes to HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS), and that contains a
polymorphism associated with a decreased susceptibility to a cell proliferation
disease, an apoptotic disease, or a cell differentiation disease. An allele-specific
oligonucleotide probe that is specific for particular polymorphisms in HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) can be prepared
using standard methods (see Current Protocols in Molecular Biology, supra).
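
One simple way to picture an allele-specific oligonucleotide is as a short window
of the gene sequence in which the polymorphic base replaces the reference base.
The Python sketch below builds such a window centered on the polymorphism and
enforces the approximately 10-50 nucleotide range stated above. It is an
illustration under stated assumptions: the function name, the default length of
19, and the toy sequence are hypothetical, the probe is returned as the
sense-strand sequence, and no melting-temperature or specificity checks are
performed.

```python
def design_aso_probe(sequence, variant_index, variant_base, length=19):
    """Return a probe of `length` nucleotides containing `variant_base`
    at 0-based `variant_index`, positioned at the center of the probe."""
    if not 10 <= length <= 50:
        raise ValueError("probe length should be about 10-50 nucleotides")
    seq = sequence.upper()
    start = variant_index - length // 2
    end = start + length
    if start < 0 or end > len(seq):
        raise ValueError("not enough flanking sequence around the variant")
    return seq[start:variant_index] + variant_base.upper() + seq[variant_index + 1:end]


# Hypothetical 34-nucleotide region with a C-to-T polymorphism at index 15.
region = "GGAATTGCCTATGACCCCTTGATGCTGAAACACC"
print(design_aso_probe(region, 15, "T"))
```

Placing the polymorphic base near the center of the probe is a common design
choice because a central mismatch tends to destabilize the duplex more than a
terminal one; the sketch adopts that choice as an assumption.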

To identify polymorphisms in the gene that are associated with a decreased
susceptibility to a cell proliferation disease, an apoptotic disease, or a cell
differentiation disease, a test sample of DNA is obtained from the individual. PCR
can be used to amplify all or a fragment of HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS), and its flanking sequences. The DNA
containing the amplified HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) (or a fragment of any of those genes) is dot-blotted, using standard
methods (see Current Protocols in Molecular Biology, supra), and the blot is
contacted with the oligonucleotide probe. The presence of specific hybridization of
the probe to the amplified HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) is then detected. Specific hybridization of an allele-specific
oligonucleotide probe to DNA from the individual is indicative of a polymorphism
in HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS), and is
therefore indicative of a decreased susceptibility to a cell proliferation disease, an
apoptotic disease, or a cell differentiation disease.

In another embodiment, arrays of oligonucleotide probes that are
complementary to target nucleic acid sequence segments from an individual can be
used to identify polymorphisms in HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS). For example, in one embodiment, an
oligonucleotide array can be used. Oligonucleotide arrays typically comprise a
plurality of different oligonucleotide probes that are coupled to a surface of a
substrate in different known locations. These oligonucleotide arrays, also described
as "GENECHIPS™," have been generally described in the art, for example, U.S.
Patent No. 5,143,854 and PCT Publication Nos. WO 90/15070 and WO 92/10092.
These arrays can generally be produced using mechanical synthesis methods or light
directed synthesis methods that incorporate a combination of photolithographic
methods and solid phase oligonucleotide synthesis methods. See Fodor et al.,
Science, 251:767-777 (1991); Pirrung et al., U.S. Patent No. 5,143,854; PCT
Publication No. WO 90/15070; Fodor et al., PCT Publication No. WO 92/10092;
and U.S. Patent No. 5,424,186, the entire teachings of each of which are
incorporated by reference herein. Techniques for the synthesis of these arrays using
mechanical synthesis methods are described in, e.g., U.S. Patent No. 5,384,261, the
entire teachings of which are incorporated by reference herein.

Once an oligonucleotide array is prepared, a nucleic acid of interest is
hybridized to the array and scanned for polymorphisms. Hybridization and scanning
are generally carried out by methods described herein and also in, e.g., Published
PCT Application Nos. WO 92/10092 and WO 95/11995, and U.S. Patent No.
5,424,186, the entire teachings of which are incorporated by reference herein. In
brief, a target nucleic acid sequence that includes one or more previously identified
polymorphic markers is amplified by well known amplification techniques, e.g.,
PCR. Typically, this involves the use of primer sequences that are complementary
to the two strands of the target sequence both upstream and downstream from the
polymorphism. Asymmetric PCR techniques may also be used. Amplified target,
generally incorporating a label, is then hybridized with the array under appropriate
conditions. Upon completion of hybridization and washing of the array, the array is
scanned to determine the position on the array to which the target sequence
hybridizes. The hybridization data obtained from the scan is typically in the form of
fluorescence intensities as a function of location on the array.
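
Scan data of the kind described above can be pictured as a mapping from array
location to fluorescence intensity. The following Python sketch, offered only as
an illustration, reports which probes show signal above a multiple of background.
The data layout, the probe names, the background value, and the threshold ratio
of 3.0 are all hypothetical assumptions and do not reflect the output format of
any particular scanner or array product.

```python
def hybridized_probes(intensities, layout, background, min_ratio=3.0):
    """Return the names of probes whose fluorescence intensity is at least
    `min_ratio` times the background level.

    `intensities` maps (row, column) to a measured intensity and `layout`
    maps (row, column) to the name of the probe at that location.
    """
    return sorted(
        layout[location]
        for location, signal in intensities.items()
        if location in layout and signal >= min_ratio * background
    )


# Hypothetical 2 x 2 detection block: one reference-allele probe and one
# variant-allele probe, each present in duplicate.
layout = {(0, 0): "ref_A", (0, 1): "var_A", (1, 0): "var_B", (1, 1): "ref_B"}
intensities = {(0, 0): 1200.0, (0, 1): 90.0, (1, 0): 85.0, (1, 1): 950.0}
print(hybridized_probes(intensities, layout, background=80.0))
```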

Although primarily described in terms of a single detection block, e.g., for
detection of a single polymorphism, arrays can include multiple detection blocks,
and thus be capable of analyzing multiple, specific polymorphisms. In alternate
arrangements, it will generally be understood that detection blocks may be grouped
within a single array or in multiple, separate arrays so that varying, optimal
conditions may be used during the hybridization of the target to the array. For
example, it may often be desirable to provide for the detection of those
polymorphisms that fall within G-C rich stretches of a genomic sequence, separately
from those falling in A-T rich segments. This allows for the separate optimization
of hybridization conditions for each situation.
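
The grouping of detection blocks by base composition can be illustrated with a
short calculation of G-C content. The Python sketch below assigns probe sequences
to G-C rich, A-T rich, or balanced groups; the cutoff values of 0.6 and 0.4, the
group names, and the toy probes are hypothetical assumptions chosen only for the
example, and the text above does not prescribe any particular cutoff.

```python
def gc_fraction(seq):
    """Return the fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def group_probes_by_gc(probes, gc_rich_cutoff=0.6, at_rich_cutoff=0.4):
    """Split a {name: sequence} mapping into three composition groups."""
    groups = {"gc_rich": [], "at_rich": [], "balanced": []}
    for name, seq in probes.items():
        fraction = gc_fraction(seq)
        if fraction >= gc_rich_cutoff:
            groups["gc_rich"].append(name)
        elif fraction <= at_rich_cutoff:
            groups["at_rich"].append(name)
        else:
            groups["balanced"].append(name)
    return groups


# Hypothetical probes with high, low, and intermediate G-C content.
probes = {"p1": "GCGCGGCCGC", "p2": "ATATTAAATT", "p3": "ACGTACGTAC"}
print(group_probes_by_gc(probes))
```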

Additional descriptions of the use of oligonucleotide arrays for detection of
polymorphisms can be found, for example, in U.S. Patent Nos. 5,858,659 and
5,837,832, the entire teachings of which are incorporated by reference herein.

Other methods of nucleic acid analysis can be used to detect polymorphisms
in HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) or
variants encoded by HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS). Representative methods include direct manual sequencing (Church
and Gilbert, Proc. Natl. Acad. Sci. USA 81: 1991-1995 (1988); Sanger et al., Proc.
Natl. Acad. Sci. 74: 5463-5467 (1977); Beavis et al., U.S. Patent No. 5,288,644);
automated fluorescent sequencing; single-stranded conformation polymorphism
assays (SSCP); clamped denaturing gel electrophoresis (CDGE); denaturing gradient
gel electrophoresis (DGGE) (Sheffield et al., Proc. Natl. Acad. Sci. USA 86:
232-236 (1991)); mobility shift analysis (Orita et al., Proc. Natl. Acad. Sci. USA 86:
2766-2770 (1989)); restriction enzyme analysis (Flavell et al., Cell 15: 25 (1978);
Geever et al., Proc. Natl. Acad. Sci. USA 78: 5081 (1981)); heteroduplex analysis;
chemical mismatch cleavage (CMC) (Cotton et al., Proc. Natl. Acad. Sci. USA 85:
4397-4401 (1985)); RNase protection assays (Myers et al., Science 230: 1242
(1985)); use of polypeptides that recognize nucleotide mismatches, such as E. coli
mutS protein; and allele-specific PCR.

In another embodiment of the invention, diagnosis of a susceptibility to a cell
proliferation disease, an apoptotic disease, or a cell differentiation disease can also
be made by examining the level of an HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) nucleic acid, for example, using in situ
hybridization techniques known to one skilled in the art, or by examining the level of
expression, activity, and/or composition of an HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) polypeptide, by a variety of methods, including
enzyme linked immunosorbent assays (ELISAs), Western blots,
immunoprecipitations, immunohistochemistry, and immunofluorescence. A test
sample from an individual is assessed for the presence of an alteration in the level of
an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) nucleic
acid or in the expression and/or an alteration in composition of the polypeptide
encoded by HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS),
or for the presence of a particular variant encoded by HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS). An alteration in expression of a
polypeptide encoded by HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) can be, for example, an alteration in the quantitative polypeptide
expression (i.e., the amount of polypeptide produced); an alteration in the
composition of a polypeptide encoded by HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS); or an alteration in the qualitative polypeptide
expression (e.g., expression of a mutant HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) polypeptide or variant thereof). In a preferred
embodiment, diagnosis of a susceptibility to a cell proliferation disease, an apoptotic
disease, or a cell differentiation disease is made by detecting a particular variant
encoded by HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS),
or a particular pattern of variants. Preferably, increased levels of HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) or increased expression or
activity of an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptide, relative to a control sample, for example, a sample
known not to be associated with a cell proliferation disease, an apoptotic disease, or
a cell differentiation disease, indicates an increased susceptibility or likelihood that
the individual has a cell proliferation disease, an apoptotic disease, or a cell
differentiation disease. Alternatively, decreased levels of HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) or decreased expression or
activity of an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptide, relative to a control sample, for example, a sample
known not to be associated with a cell proliferation disease, an apoptotic disease, or
a cell differentiation disease, indicates a decreased susceptibility or likelihood that
the individual has a cell proliferation disease, an apoptotic disease, or a cell
differentiation disease.

Both quantitative and qualitative alterations can also be present. An
"alteration" or "modulation" in the polypeptide expression, activity, or composition,
as used herein, refers to an alteration in expression or composition in a test sample,
as compared with the expression or composition of HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide in a control
sample. A control sample is a sample that corresponds to the test sample (e.g., is
from the same type of cells), and is from an individual who is not affected by a cell
proliferation disease, an apoptotic disease, or a cell differentiation disease. An
alteration in the expression or composition of the polypeptide in the test sample, as
compared with the control sample, is indicative of a decreased susceptibility to a cell
proliferation disease, an apoptotic disease, or a cell differentiation disease.
Similarly, the presence of one or more different variants in the test sample, or the
presence of significantly different amounts of different variants in the test sample, as
compared with the control sample, is indicative of a decreased susceptibility to a cell
proliferation disease, an apoptotic disease, or a cell differentiation disease.

It is understood that alterations or modulations in polypeptide expression or
function can occur in varying degrees. For example, an alteration or modulation in
expression can be an increase, for example, by at least 1.5-fold to 2-fold, at least
3-fold, or at least 5-fold, relative to the control. Alternatively, the alteration or
modulation in polypeptide expression can be a decrease, for example, by at least
10%, at least 40%, 50%, or 75%, or by at least 90%, relative to the control.
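
The degrees of alteration recited above can be expressed as a simple calculation
on measured expression levels. The Python sketch below summarizes a test level
against a control level as a fold change and a percent change, and checks the
result against a chosen threshold. It is illustrative only: the function names
are hypothetical, the default thresholds of 1.5-fold and 10% are taken from the
lowest values recited above, and the input levels are assumed to be in the same
arbitrary units.

```python
def expression_alteration(test_level, control_level):
    """Summarize test versus control expression."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    fold = test_level / control_level
    if fold > 1:
        direction = "increase"
    elif fold < 1:
        direction = "decrease"
    else:
        direction = "none"
    return {
        "direction": direction,
        "fold_change": fold,
        "percent_change": (fold - 1) * 100,
    }


def meets_threshold(summary, min_fold_increase=1.5, min_percent_decrease=10):
    """Return True if the alteration is at least as large as the threshold."""
    if summary["direction"] == "increase":
        return summary["fold_change"] >= min_fold_increase
    if summary["direction"] == "decrease":
        return -summary["percent_change"] >= min_percent_decrease
    return False


# Hypothetical measurements in arbitrary units.
print(expression_alteration(30, 10))   # three-fold increase
print(expression_alteration(2.5, 10))  # 75% decrease
```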

Various means of examining expression or composition of the HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide can be
used, including spectroscopy, colorimetry, electrophoresis, isoelectric focusing, and
immunoassays (e.g., David et al., U.S. Patent No. 4,376,110) such as
immunoblotting (see also Ausubel et al., supra; particularly chapter 10). For
example, in one embodiment, an antibody capable of binding to the polypeptide
(e.g., as described above), preferably an antibody with a detectable label, can be
used. Antibodies can be polyclonal, or more preferably, monoclonal. An intact
antibody, or a fragment thereof (e.g., Fab or F(ab')₂), can be used. The term
"labeled," with regard to the antibody, is intended to encompass direct labeling of
the antibody by coupling (i.e., physically linking) a detectable substance to the
antibody, as well as indirect labeling of the antibody by reacting it with another
reagent that is directly labeled. An example of indirect labeling is detection of a
primary antibody using a fluorescently labeled secondary antibody.

Western blotting analysis, using an antibody as described above that
specifically binds to a mutant HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) polypeptide, or an antibody that specifically
binds to a non-mutant HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptide, or an antibody that specifically binds to a particular
variant encoded by HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS), can be used to identify the presence in a test sample of a particular
variant of a polypeptide encoded by a polymorphic or mutant HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS), or the absence in a test sample
of a particular variant or of a polypeptide encoded by a non-polymorphic or
non-mutant gene. The presence of a polypeptide encoded by a polymorphic or
mutant gene, or the absence of a polypeptide encoded by a non-polymorphic or
non-mutant gene, is diagnostic for a decreased susceptibility to a cell proliferation
disease, an apoptotic disease, or a cell differentiation disease, as is the presence (or
absence) of particular variants encoded by the HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) nucleic acid molecule.

In one embodiment of this method, the level or amount of HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide in a test
sample is compared with the level or amount of the HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide in a control
sample. A level or amount of the polypeptide in the test sample that is higher or
lower than the level or amount of the polypeptide in the control sample, such that the
difference is statistically significant, is indicative of an alteration in the expression of
the HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide, and is diagnostic for a decreased susceptibility to a cell proliferation
disease, an apoptotic disease, or a cell differentiation disease.

Alternatively, the composition of the HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) polypeptide in a test sample is compared with
the composition of the HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptide in a control sample. A difference in the composition of
the polypeptide in the test sample, as compared with the composition of the
polypeptide in the control sample (e.g., the presence of different variants), is
diagnostic for a decreased susceptibility to a cell proliferation disease, an apoptotic
disease, or a cell differentiation disease. In another embodiment, both the level or
amount and the composition of the polypeptide can be assessed in the test sample
and in the control sample. A difference in the amount or level of the polypeptide in
the test sample, compared to the control sample; a difference in composition in the
test sample, compared to the control sample; or both a difference in the amount or
level, and a difference in the composition, is indicative of a decreased susceptibility
to a cell proliferation disease, an apoptotic disease, or a cell differentiation disease.

Kits (e.g., reagent kits) useful in the methods of diagnosis comprise
components useful in any of the methods described herein, including, for example,
hybridization probes or primers as described herein (e.g., labeled probes or primers),
reagents for detection of labeled molecules, restriction enzymes (e.g., for RFLP
analysis), allele-specific oligonucleotides, antibodies that bind to a mutant or to
non-mutant (native) HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptide, means for amplification of nucleic acids comprising
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS), or means
for analyzing the nucleic acid sequence of HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS), or for analyzing the amino acid sequence of an
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide, etc.

SCREENING ASSAYS AND AGENTS IDENTIFIED THEREBY

The invention provides methods (also referred to herein as "screening
assays") for identifying the presence of a nucleotide that hybridizes to a nucleic acid
of the invention, as well as for identifying the presence of a polypeptide encoded by
a nucleic acid of the invention. In one embodiment, the presence (or absence) of a
nucleic acid molecule of interest (e.g., a nucleic acid that has significant homology
with a nucleic acid of HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS)) in a sample can be assessed by contacting the sample with a nucleic
acid comprising a nucleic acid of the invention (e.g., a nucleic acid having the
sequence of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, or SEQ
ID NO: 9, which may optionally comprise at least one polymorphism, or the
complement thereof, or a nucleic acid encoding an amino acid having the sequence
of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or SEQ ID NO:
10, or a fragment or variant of such nucleic acids), under stringent conditions as
described above, and then assessing the sample for the presence (or absence) of
hybridization. In a preferred embodiment, high stringency conditions are conditions
appropriate for selective hybridization. In another embodiment, a sample containing
the nucleic acid molecule of interest is contacted with a nucleic acid containing a
contiguous nucleotide sequence (e.g., a primer or a probe as described above) that is
at least partially complementary to a part of the nucleic acid molecule of interest
(e.g., an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
nucleic acid), and the contacted sample is assessed for the presence or absence of
hybridization. In a preferred embodiment, the nucleic acid containing a contiguous
nucleotide sequence is completely complementary to a part of the nucleic acid
molecule of HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS).
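
What "completely complementary to a part of the nucleic acid molecule" means
at the sequence level can be shown with a minimal sketch. It checks only
Watson-Crick base pairing of DNA strings written 5' to 3'; it does not model
stringency, melting temperature, or partial complementarity, and the function
names and the short example sequences are hypothetical.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_fully_complementary(probe, target):
    """True if the probe pairs perfectly with some stretch of the target.

    A probe anneals to the target strand when the probe's reverse complement
    occurs as a contiguous substring of the target.
    """
    return reverse_complement(probe).upper() in target.upper()


# Hypothetical toy sequences, not taken from any SEQ ID NO.
target = "GGAATTGCCTATGACC"
print(is_fully_complementary("ATAGGCAA", target))  # perfect match to TTGCCTAT
print(is_fully_complementary("ATAGGCTA", target))  # one mismatch
```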

In any of the above embodiments, all or a portion of the nucleic acid of 
interest can be subjected to amplification prior to performing the hybridization. 

In another embodiment, the presence (or absence) of an HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide, such as a
polypeptide of the invention or a fragment or variant thereof, in a sample can be
assessed by contacting the sample with an antibody that specifically binds to the
polypeptide of HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) (e.g., an antibody such as those described above), and then assessing
the sample for the presence (or absence) of binding of the antibody to the HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide.

In another embodiment, the invention provides methods for identifying
agents or compounds (e.g., fusion proteins, polypeptides, peptidomimetics,
prodrugs, receptors, binding agents, antibodies, small molecules or other drugs, or
ribozymes) that alter or modulate (e.g., increase or decrease) the activity of the
polypeptides described herein, or that otherwise interact with the polypeptides
herein. For example, such compounds can be compounds or agents that bind to
polypeptides described herein (e.g., HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) substrates or agents); that have a stimulatory or
inhibitory effect on, for example, activity of polypeptides of the invention; or that
change (e.g., enhance or inhibit) the ability of the polypeptides of the invention to
interact with HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) binding agents; or that alter post-translational processing of the
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide (e.g., agents that alter proteolytic processing to direct the polypeptide
from where it is normally synthesized to another location in the cell, such as the cell
surface; or agents that alter proteolytic processing such that more polypeptide is
released from the cell, etc.). In one example, the binding agent is a cell proliferation
disease binding agent, an apoptotic disease binding agent, or a cell differentiation
disease binding agent. As used herein, by a "cell proliferation disease binding
agent," an "apoptotic disease binding agent," or a "cell differentiation disease
binding agent" is meant an agent as described herein that binds to a polypeptide of
the present invention and modulates a cell proliferation disease, an apoptotic disease,
or a cell differentiation disease. The modulation can be an increase or a decrease in
the severity or progression of the disease. In addition, a cell proliferation disease
binding agent, an apoptotic disease binding agent, or a cell differentiation disease
binding agent includes an agent that binds to a polypeptide that is upstream (earlier)
or downstream (later) of the cell signaling events mediated by a polypeptide of the
present invention, and thereby modulates the overall activity of the signaling
pathway; in turn, the disease state is modulated.

The candidate compound can cause an increase in the activity of the
polypeptide. For example, the activity of the polypeptide can be increased by at least
1.5-fold to 2-fold, at least 3-fold, or at least 5-fold, relative to the control.
Alternatively, the polypeptide activity can be decreased, for example, by at least
10%, at least 20%, 40%, 50%, or 75%, or by at least 90%, relative to the control.
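
The fold-increase and percent-decrease figures above reduce to simple
arithmetic on a treated reading and a control reading. The sketch below is
illustrative only: the function names are hypothetical, the threshold tuples
merely restate the example values given in the text, and both readings are
assumed to be positive numbers in the same units.

```python
# Example thresholds restated from the text (fold increase, percent decrease).
INCREASE_THRESHOLDS = (1.5, 2.0, 3.0, 5.0)
DECREASE_THRESHOLDS = (10, 20, 40, 50, 75, 90)


def fold_increase(treated, control):
    """Activity with the compound divided by activity of the control."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    return treated / control


def percent_decrease(treated, control):
    """Percentage by which activity falls relative to the control."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * (control - treated) / control


def highest_threshold_met(value, thresholds):
    """Largest threshold that the value reaches, or None if it reaches none."""
    met = [t for t in thresholds if value >= t]
    return max(met) if met else None
```

For instance, a treated reading of 30 against a control of 10 is a 3-fold
increase, and a treated reading of 2.5 against a control of 10 is a 75%
decrease.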

In one embodiment, the invention provides assays for screening candidate
compounds or test agents to identify compounds that bind to or modulate the activity
of polypeptides described herein (or biologically active portion(s) thereof), as well as
agents identifiable by the assays. As used herein, a "candidate compound" or "test
agent" is a chemical molecule, be it naturally-occurring or artificially-derived, and
includes, for example, peptides, proteins, synthesized molecules (for example,
synthetic organic molecules), naturally-occurring molecules (for example, naturally
occurring organic molecules), nucleic acid molecules, and components thereof.

In general, candidate compounds for uses in the present invention may be
identified from large libraries of natural products or synthetic (or semi-synthetic)
extracts or chemical libraries according to methods known in the art. Those skilled
in the field of drug discovery and development will understand that the precise
source of test extracts or compounds is not critical to the screening procedure(s) of
the invention. Accordingly, virtually any number of chemical extracts or compounds
can be screened using the exemplary methods described herein. Examples of such
extracts or compounds include, but are not limited to, plant-, fungal-, prokaryotic- or
animal-based extracts, fermentation broths, and synthetic compounds, as well as
modifications of existing compounds. Numerous methods are also available for
generating random or directed synthesis (e.g., semi-synthesis or total synthesis) of
any number of chemical compounds, including, but not limited to, saccharide-,
lipid-, peptide-, and nucleic acid-based compounds. Synthetic compound libraries
are commercially available, e.g., from Brandon Associates (Merrimack, NH) and
Aldrich Chemical (Milwaukee, WI). Alternatively, libraries of natural compounds
in the form of bacterial, fungal, plant, and animal extracts are commercially
available from a number of sources, including Biotics (Sussex, UK), Xenova
(Slough, UK), Harbor Branch Oceanographics Institute (Ft. Pierce, FL), and
PharmaMar, U.S.A. (Cambridge, MA). In addition, natural and synthetically
produced libraries are generated, if desired, according to methods known in the art,
e.g., by standard extraction and fractionation methods. For example, candidate
compounds can be obtained using any of the numerous approaches in combinatorial
library methods known in the art, including: biological libraries; spatially
addressable parallel solid phase or solution phase libraries; synthetic library methods
requiring deconvolution; the "one-bead one-compound" library method; and
synthetic library methods using affinity chromatography selection. The biological
library approach is limited to polypeptide libraries, while the other four approaches
are applicable to polypeptide, non-peptide oligomer or small molecule libraries of
compounds (Lam, Anticancer Drug Des., 12: 145 (1997)). Furthermore, if desired,
any library or compound is readily modified using standard chemical, physical, or
biochemical methods.

In addition, those skilled in the art of drug discovery and development
readily understand that methods for dereplication (e.g., taxonomic dereplication,
biological dereplication, and chemical dereplication, or any combination thereof) or
the elimination of replicates or repeats of materials already known for their activities
should be employed whenever possible.

When a crude extract is found to modulate (i.e., stimulate or inhibit) the
expression and/or activity of the nucleic acids and/or polypeptides of the present
invention, further fractionation of the positive lead extract is necessary to isolate
chemical constituents responsible for the observed effect. Thus, the goal of the
extraction, fractionation, and purification process is the careful characterization and
identification of a chemical entity within the crude extract having an activity that
stimulates or inhibits nucleic acid expression, polypeptide expression, or polypeptide
biological activity. The same assays described herein for the detection of activities
in mixtures of compounds can be used to purify the active component and to test
derivatives thereof. Methods of fractionation and purification of such heterogeneous
extracts are known in the art. If desired, compounds shown to be useful agents for
treatment are chemically modified according to methods known in the art.
Compounds identified as being of therapeutic value may be subsequently analyzed
using animal models for diseases in which it is desirable to alter the activity or
expression of the nucleic acids or polypeptides of the present invention.

In one embodiment, to identify candidate compounds that alter the biological
activity, for example, the enzymatic activity or transcriptional repression activity of
an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide, a cell, tissue, cell lysate, tissue lysate, or solution containing or
expressing an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptide (e.g., SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ
ID NO: 8, SEQ ID NO: 10, or another variant encoded by HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)), or a fragment or derivative
thereof (as described above), can be contacted with a candidate compound to be
tested under conditions suitable for enzymatic reaction or transcriptional repression
reaction, as described herein.

Alternatively, the polypeptide can be contacted directly with the candidate
compound to be tested. The level (amount) of HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) biological activity is assessed (e.g., the level
(amount) of HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) biological activity is measured, either directly or indirectly), and is
compared with the level of biological activity in a control (i.e., the level of activity
of the HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide or active fragment or derivative thereof in the absence of the candidate
compound to be tested, or in the presence of the candidate compound vehicle only).
If the level of the biological activity in the presence of the candidate compound
differs, by an amount that is statistically significant, from the level of the biological
activity in the absence of the candidate compound, or in the presence of the
candidate compound vehicle only, then the candidate compound is a compound that
alters the biological activity of an HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) polypeptide. For example, an increase in the
level of HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
enzymatic or transcriptional repression activity relative to a control indicates that
the candidate compound is a compound that enhances (is an agonist of) HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) activity. Similarly,
a decrease in the enzymatic level or transcriptional repression level of HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) activity relative to a
control indicates that the candidate compound is a compound that inhibits (is an
antagonist of) HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) activity. In another embodiment, the level of biological activity of an
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide or derivative or fragment thereof in the presence of the candidate
compound to be tested is compared with a control level that has previously been
established. A level of the biological activity in the presence of the candidate
compound that differs from the control level by an amount that is statistically
significant indicates that the compound alters HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) biological activity.
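
The decision rule in this paragraph (a statistically significant increase
marks an agonist, a statistically significant decrease marks an antagonist)
can be written out as a small sketch. The function name and return labels are
hypothetical, and the significance flag is assumed to come from whatever
statistical test the practitioner has chosen; the text does not prescribe one.

```python
from statistics import mean


def classify_compound(with_compound, control, significant):
    """Label a candidate compound from replicate activity readings.

    with_compound: activity readings in the presence of the candidate.
    control: readings without the candidate (or with vehicle only).
    significant: result of the caller's own statistical test (an assumption;
        this sketch does not compute significance itself).
    """
    if not significant:
        return "no significant effect"
    difference = mean(with_compound) - mean(control)
    if difference > 0:
        return "agonist"      # activity enhanced relative to control
    if difference < 0:
        return "antagonist"   # activity inhibited relative to control
    return "no significant effect"
```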

The present invention also relates to an assay for identifying compounds
(e.g., antisense nucleic acids, fusion proteins, polypeptides, peptidomimetics,
prodrugs, receptors, binding agents, antibodies, small molecules or other drugs, or
ribozymes) that alter (e.g., increase or decrease) expression (e.g., transcription or
translation) of an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) nucleic acid molecule or that
otherwise interact with the nucleic acids described herein, as well as compounds
identifiable by the assays. For example, a solution containing a nucleic acid
encoding an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptide can be contacted with a candidate compound to be tested.
The solution can comprise, for example, cells containing the nucleic acid or cell
lysate containing the nucleic acid; alternatively, the solution can be another solution
that comprises elements necessary for transcription/translation of the nucleic acid.
Cells not suspended in solution can also be employed, if desired. The level and/or
pattern of HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
expression (e.g., the level and/or pattern of mRNA or of protein expressed, such as
the level and/or pattern of different variants) is assessed, and is compared with the
level and/or pattern of expression in a control (i.e., the level and/or pattern of
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) expression in
the absence of the candidate compound, or in the presence of the candidate
compound vehicle only). If the level and/or pattern in the presence of the candidate
compound differs, by an amount or in a manner that is statistically significant, from
the level and/or pattern in the absence of the candidate compound, or in the presence
of the candidate compound vehicle only, then the candidate compound is a
compound that alters the expression of HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS). Enhancement of HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) expression indicates that the
candidate compound is an agonist of HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) activity. Similarly, inhibition of HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) expression indicates
that the candidate compound is an antagonist of HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) activity. In another embodiment, the level
and/or pattern of an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptide(s) (e.g., different variants) in the presence of the
candidate compound to be tested is compared with a control level and/or pattern that
has previously been established. A level and/or pattern in the presence of the
candidate compound that differs from the control level and/or pattern by an amount
or in a manner that is statistically significant indicates that the candidate compound
alters HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
expression.

In another embodiment of the invention, compounds that alter the expression
of an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
nucleic acid molecule or that otherwise interact with the nucleic acids described
herein, can be identified using a cell, cell lysate, or solution containing a nucleic
acid encoding the promoter region of the HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) gene operably linked to a reporter gene. After
contact with a candidate compound to be tested, the level of expression of the
reporter gene (e.g., the level of mRNA or of protein expressed) is assessed, and is
compared with the level of expression in a control (i.e., the level of the expression
of the reporter gene in the absence of the candidate compound, or in the presence of
the candidate compound vehicle only). If the level in the presence of the candidate
compound differs, by an amount or in a manner that is statistically significant, from
the level in the absence of the candidate compound, or in the presence of the
candidate compound vehicle only, then the candidate compound is a compound that
alters the expression of HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS), as indicated by its ability to alter expression of a gene that is
operably linked to the HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) gene promoter. Enhancement of the expression of the reporter
indicates that the compound is an agonist of HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) activity. Similarly, inhibition of the expression
of the reporter indicates that the compound is an antagonist of HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) activity. In another
embodiment, the level of expression of the reporter in the presence of the candidate
compound to be tested is compared with a control level that has previously been
established. A level in the presence of the candidate compound that differs from the
control level by an amount or in a manner that is statistically significant indicates
that the candidate compound alters HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) expression.

Compounds that alter the amounts of different variants encoded by HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) (e.g., a compound
that enhances activity of a first variant, and that inhibits activity of a second variant),
as well as compounds that are agonists of activity of a first variant and antagonists
of activity of a second variant, can easily be identified using the methods
described above.

In other embodiments of the invention, assays can be used to assess the
impact of a candidate compound on the activity of a polypeptide in relation to an
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) substrate,
for example, an inhibitor of histone deacetylase activity. These inhibitors fall into
four general classes: 1) short-chain fatty acids (e.g., 4-phenylbutyrate and valproic
acid); 2) hydroxamic acids (e.g., SAHA, Pyroxamide, trichostatin A (TSA),
oxamflatin and CHAPs, such as CHAP1 and CHAP31); 3) cyclic tetrapeptides
(Trapoxin A, Apicidin and Depsipeptide (FK-228, also known as FR901228)); 4)
benzamides (e.g., MS-275); and other compounds such as Scriptaid. Examples of
such assays and compounds can be found in U.S. Patent Nos. 5,369,108, issued on
November 29, 1994, 5,700,811, issued on December 23, 1997, and 5,773,474,
issued on June 30, 1998 to Breslow et al., U.S. Patent Nos. 5,055,608, issued on
October 8, 1991, and 5,175,191, issued on December 29, 1992 to Marks et al., as
well as Yoshida et al., supra; Saito et al., supra; Furumai et al., supra; Komatsu et
al., supra; Su et al., supra; Lee et al., supra and Suzuki et al., supra, the entire
contents of all of which are hereby incorporated by reference.

In one example, a cell or tissue that expresses or contains a compound that
interacts with HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) (herein referred to as an "HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) substrate," which can be a polypeptide or other
molecule that interacts with HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS),
or HDRP(ANLS)) is contacted with HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) in the presence of a candidate compound, and
the ability of the candidate compound to alter the interaction between HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) and the HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) substrate is
determined, for example, by assaying activity of the polypeptide. Alternatively, a
cell lysate or a solution containing the HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) substrate can be used. A compound that binds
to HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) or the
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) substrate
can alter the interaction by interfering with, or enhancing, the ability of HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) to bind to, associate
with, or otherwise interact with the HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) substrate.

Determining the ability of the candidate compound to bind to HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) or an HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) substrate can be
accomplished, for example, by coupling the candidate compound with a
radioisotope or enzymatic label such that binding of the candidate compound to the
polypeptide can be determined by detecting the labeled compound. For example, the
candidate compound can be labeled with 125I, 35S, 14C, or 3H,
either directly or indirectly, and the radioisotope detected by direct counting of
radioemission or by scintillation counting. Alternatively, the candidate compound can
be enzymatically labeled with, for example, horseradish peroxidase, alkaline
phosphatase, or luciferase, and the enzymatic label detected by determination of
conversion of an appropriate substrate to product.

It is also within the scope of this invention to determine the ability of a
candidate compound to interact with the polypeptide without the labeling of any of
the interactants. For example, a microphysiometer can be used to detect the
interaction of a candidate compound with HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) or an HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) substrate without the labeling of either the
candidate compound, HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS), or the HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) substrate (McConnell et al., (1992) Science, 257: 1906-1912). As
used herein, a "microphysiometer" (e.g., CYTOSENSOR™) is an analytical
instrument that measures the rate at which a cell acidifies its environment using a
light-addressable potentiometric sensor (LAPS). Changes in this acidification rate
can be used as an indicator of the interaction between ligand and polypeptide.

In another embodiment of the invention, assays can be used to identify
polypeptides that interact with one or more HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) polypeptides, as described herein. For example,
a yeast two-hybrid system such as that described by Fields and Song (Fields and
Song, Nature 340: 245-246 (1989)) can be used to identify polypeptides that interact
with one or more HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptides. In such a yeast two-hybrid system, vectors are
constructed based on the flexibility of a transcription factor that has two functional
domains (a DNA binding domain and a transcription activation domain). If the two
domains are separated but fused to two different proteins that interact with one
another, transcriptional activation can be achieved, and transcription of specific
markers (e.g., nutritional markers such as His and Ade, or color markers such as
lacZ) can be used to identify the presence of interaction and transcriptional
activation. For example, in the methods of the invention, a first vector is used that
includes a nucleic acid encoding a DNA binding domain and an HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide, variant, or
fragment or derivative thereof, and a second vector is used that includes a nucleic
acid encoding a transcription activation domain and a nucleic acid encoding a
polypeptide that potentially may interact with the HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide, variant, or
fragment or derivative thereof (e.g., an HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) polypeptide substrate or receptor). Incubation
of yeast containing the first vector and the second vector under appropriate
conditions (e.g., mating conditions such as used in the MATCHMAKER™ system
from Clontech) allows identification of colonies that express the markers of
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS). These
colonies can be examined to identify the polypeptide(s) that interact with the
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide or fragment or derivative thereof. Such polypeptides may be useful as
compounds that alter the activity or expression of an HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide, as described
above.

In more than one embodiment of the above assay methods of the present
invention, it may be desirable to immobilize an HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide, or an HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) substrate, or other
components of the assay on a solid support, in order to facilitate separation of
complexed from uncomplexed forms of one or both of the polypeptides, as well as
to accommodate automation of the assay. Binding of a candidate compound to the
polypeptide, or interaction of the polypeptide with a substrate in the presence and
absence of a candidate compound, can be accomplished in any vessel suitable for
containing the reactants. Examples of such vessels include microtitre plates, test
tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein (e.g., a
glutathione-S-transferase fusion protein) can be provided that adds a domain that
allows HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) or
an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
substrate to be bound to a matrix or other solid support.

In another embodiment, modulators of expression of nucleic acid molecules
of the invention are identified in a method wherein a cell, cell lysate, tissue, tissue
lysate, or solution containing a nucleic acid encoding HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) is contacted with a candidate
compound and the expression of appropriate mRNA or polypeptide (e.g., variant(s))
in the cell, cell lysate, tissue, tissue lysate, or solution is determined. The level
of expression of appropriate mRNA or polypeptide(s) in the presence of the
candidate compound is compared to the level of expression of mRNA or
polypeptide(s) in the absence of the candidate compound, or in the presence of the
candidate compound vehicle only. The candidate compound can then be identified
as a modulator of expression based on this comparison. For example, when
expression of mRNA or polypeptide is greater (statistically significantly greater) in
the presence of the candidate compound than in its absence, the candidate
compound is identified as a stimulator or enhancer of the mRNA or polypeptide
expression. Alternatively, when expression of the mRNA or polypeptide is less
(statistically significantly less) in the presence of the candidate compound than in its
absence, the candidate compound is identified as an inhibitor of the mRNA or
polypeptide expression. The level of mRNA or polypeptide expression in the cells
can be determined by methods described herein for detecting mRNA or polypeptide.

This invention further pertains to novel compounds identified by the
above-described screening assays. Accordingly, it is within the scope of this
invention to further use a compound identified as described herein in an appropriate
animal model. For example, a compound identified as described herein (e.g., a
candidate compound that is a modulating compound such as an antisense nucleic
acid molecule, a specific antibody, or a polypeptide substrate) can be used in an
animal model to determine the efficacy, toxicity, or side effects of treatment with
such a compound. Alternatively, a compound identified as described herein can be
used in an animal model to determine the mechanism of action of such a compound.
Furthermore, this invention pertains to uses of novel compounds identified by the
above-described screening assays for treatments as described herein. In addition, a
compound identified as described herein can be used to alter activity of an HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide, or to
alter expression of HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS), by contacting the polypeptide or the nucleic acid molecule (or
contacting a cell comprising the polypeptide or the nucleic acid molecule) with the
compound identified as described herein.

PHARMACEUTICAL COMPOSITIONS

The present invention also pertains to pharmaceutical compositions
comprising nucleic acids described herein, particularly nucleotides encoding the
polypeptides described herein; comprising polypeptides described herein (e.g., SEQ
ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, and/or
other variants encoded by HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS)); and/or comprising a compound that alters (e.g., increases or
decreases) HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
expression or HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptide activity as described herein. For instance, a polypeptide,
protein, fragment, fusion protein or prodrug thereof, or a nucleotide or nucleic acid
construct (vector) comprising a nucleotide of the present invention, a compound that
alters HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide activity, a compound that alters HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) nucleic acid expression, or an HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) substrate or binding
partner, can be formulated with a physiologically acceptable carrier or excipient to
prepare a pharmaceutical composition. The carrier and composition can be sterile.
The formulation should suit the mode of administration.

Suitable pharmaceutically acceptable carriers include but are not limited to
water, salt solutions (e.g., NaCl), saline, buffered saline, alcohols, glycerol, ethanol,
gum arabic, vegetable oils, benzyl alcohols, polyethylene glycols, gelatin,
carbohydrates such as lactose, amylose or starch, dextrose, magnesium stearate, talc,
silicic acid, viscous paraffin, perfume oil, fatty acid esters, hydroxymethylcellulose,
polyvinyl pyrrolidone, etc., as well as combinations thereof. The pharmaceutical
preparations can, if desired, be mixed with auxiliary agents, e.g., lubricants,
preservatives, stabilizers, wetting agents, emulsifiers, salts for influencing osmotic
pressure, buffers, coloring, flavoring and/or aromatic substances and the like that do
not deleteriously react with the active compounds.

The composition, if desired, can also contain minor amounts of wetting or
emulsifying agents, or pH buffering agents. The composition can be a liquid
solution, suspension, emulsion, tablet, pill, capsule, sustained release formulation,
or powder. The composition can be formulated as a suppository, with traditional
binders and carriers such as triglycerides. Oral formulation can include standard
carriers such as pharmaceutical grades of mannitol, lactose, starch, magnesium
stearate, polyvinyl pyrrolidone, sodium saccharine, cellulose, magnesium carbonate,
etc.

Methods of introduction of these compositions include, but are not limited
to, intradermal, intramuscular, intraperitoneal, intraocular, intravenous,
subcutaneous, topical, oral and intranasal. Other suitable methods of introduction
can also include gene therapy (as described below), rechargeable or biodegradable
devices, particle acceleration devices ("gene guns") and slow release polymeric
devices. The pharmaceutical compositions of this invention can also be
administered as part of a combinatorial therapy with other compounds.

The composition can be formulated in accordance with the routine
procedures as a pharmaceutical composition adapted for administration to human
beings. For example, compositions for intravenous administration typically are
solutions in sterile isotonic aqueous buffer. Where necessary, the composition may
also include a solubilizing agent and a local anesthetic to ease pain at the site of the
injection. Generally, the ingredients are supplied either separately or mixed together
in unit dosage form, for example, as a dry lyophilized powder or water free
concentrate in a hermetically sealed container such as an ampule or sachette
indicating the quantity of active compound. Where the composition is to be
administered by infusion, it can be dispensed with an infusion bottle containing
sterile pharmaceutical grade water, saline or dextrose/water. Where the composition
is administered by injection, an ampule of sterile water for injection or saline can be
provided so that the ingredients may be mixed prior to administration.

For topical application, nonsprayable forms, viscous to semi-solid or solid
forms comprising a carrier compatible with topical application and having a
dynamic viscosity preferably greater than water, can be employed. Suitable
formulations include but are not limited to solutions, suspensions, emulsions,
creams, ointments, powders, enemas, lotions, sols, liniments, salves, aerosols, etc.,
that are, if desired, sterilized or mixed with auxiliary agents, e.g., preservatives,
stabilizers, wetting agents, buffers or salts for influencing osmotic pressure, etc.
The compound may be incorporated into a cosmetic formulation. For topical
application, also suitable are sprayable aerosol preparations wherein the active
ingredient, preferably in combination with a solid or liquid inert carrier material, is
packaged in a squeeze bottle or in admixture with a pressurized volatile, normally
gaseous propellant, e.g., pressurized air.

Compounds described herein can be formulated as neutral or salt forms.
Pharmaceutically acceptable salts include those formed with free amino groups such
as those derived from hydrochloric, phosphoric, acetic, oxalic, tartaric acids, etc.,
and those formed with free carboxyl groups such as those derived from sodium,
potassium, ammonium, calcium, ferric hydroxides, isopropylamine, triethylamine,
2-ethylamino ethanol, histidine, procaine, etc.

The compounds are administered in a therapeutically effective amount. The
amount of compounds that will be therapeutically effective in the treatment of a
particular disorder or condition will depend on the nature of the disorder or
condition, and can be determined by standard clinical techniques. In addition, in
vitro or in vivo assays may optionally be employed to help identify optimal dosage
ranges. The precise dose to be employed in the formulation will also depend on the
route of administration, and the seriousness of the symptoms of a cell proliferation
disease, an apoptotic disease, or a cell differentiation disease, and should be decided
according to the judgment of a practitioner and each patient's circumstances.
Effective doses may be extrapolated from dose-response curves derived from in
vitro or animal model test systems.

The invention also provides a pharmaceutical pack or kit comprising one or
more containers filled with one or more of the ingredients of the pharmaceutical
compositions of the invention. Optionally associated with such container(s) can be
a notice in the form prescribed by a governmental agency regulating the
manufacture, use or sale of pharmaceuticals or biological products, which notice
reflects approval by the agency of manufacture, use or sale for human
administration. The pack or kit can be labeled with information regarding mode of
administration, sequence of drug administration (e.g., separately, sequentially or
concurrently), or the like. The pack or kit may also include means for reminding the
patient to take the therapy. The pack or kit can be a single unit dosage of the
combination therapy or it can be a plurality of unit dosages. In particular, the
compounds can be separated, mixed together in any combination, or present in a
single vial or tablet. Compounds assembled in a blister pack or other dispensing
means are preferred. For the purpose of this invention, unit dosage is intended to
mean a dosage that is dependent on the individual pharmacodynamics of each
compound and administered in FDA-approved dosages in standard time courses.



METHODS OF THERAPY 

The present invention also pertains to methods of treatment (prophylactic,
diagnostic, and/or therapeutic) for a cell proliferation disease, an apoptotic disease,
or a cell differentiation disease, using an HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) therapeutic compound. An "HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) therapeutic
compound" is a compound that alters (e.g., enhances or inhibits) HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide activity and/or
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) nucleic acid
molecule expression, as described herein (e.g., an HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) agonist or antagonist).
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
therapeutic compounds can alter HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) polypeptide activity or nucleic acid molecule
expression by a variety of means, such as, for example, by providing additional
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide or by upregulating the transcription or translation of the HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) nucleic acid
molecule; by altering post-translational processing of the HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide; by altering
transcription of HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) variants; or by interfering with HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) polypeptide activity (e.g., by binding to an
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide), or by downregulating the transcription or translation of the HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) nucleic acid
molecule. Representative HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS),
or HDRP(ANLS) therapeutic compounds include the following: nucleic acids or
fragments or derivatives thereof described herein, particularly nucleotides encoding
the polypeptides described herein and vectors comprising such nucleic acids (e.g., a
nucleic acid molecule, cDNA, and/or RNA, such as a nucleic acid encoding an
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide or active fragment or derivative thereof, or an oligonucleotide; for
example, SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, or SEQ ID
NO: 9, which may optionally comprise at least one polymorphism, or a nucleic acid
encoding SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID
NO: 10, or fragments or derivatives thereof); polypeptides described herein (e.g.,
SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10,
and/or other variants encoded by HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS), or fragments or derivatives thereof); HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) substrates;
peptidomimetics; fusion proteins or prodrugs thereof; antibodies (e.g., an antibody
to a mutant HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptide, or an antibody to a non-mutant HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide, or an antibody to
a particular variant encoded by HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS), as described above); ribozymes; other small
molecules; and other compounds that alter (e.g., enhance or inhibit) HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) nucleic acid
expression or polypeptide activity, for example, those compounds identified in the
screening methods described herein, or that regulate transcription of HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) variants (e.g.,
compounds that affect which variants are expressed, or that affect the amount of
each variant that is expressed). More than one HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) therapeutic compound can be used
concurrently, if desired.

The HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) therapeutic compound that is a nucleic acid is used in the treatment
of a cell proliferation disease, an apoptotic disease, or a cell differentiation disease.
The term "treatment," as used herein, refers not only to ameliorating symptoms
associated with the disease, but also to preventing or delaying the onset of the
disease, and also lessening the severity or frequency of symptoms of the disease. The
therapy is designed to alter (e.g., inhibit or enhance), replace or supplement activity
of an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide in an individual. For example, an HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) therapeutic compound can be administered in
order to upregulate or increase the expression or availability of the HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) nucleic acid molecule
or of specific variants of HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS), or, conversely, to downregulate or decrease the expression or
availability of the HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) nucleic acid molecule or specific variants of HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS). Upregulation or increasing
expression or availability of a native HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) nucleic acid molecule or of a particular variant
could interfere with or compensate for the expression or activity of a defective gene
or another variant; downregulation or decreasing expression or availability of a
native HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
nucleic acid molecule or of a particular variant could minimize the expression or
activity of a defective gene or the particular variant and thereby minimize the impact
of the defective gene or the particular variant.

The HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) therapeutic compound(s) are administered in a therapeutically
effective amount (i.e., an amount that is sufficient to treat the disease, such as by
ameliorating symptoms associated with the disease, preventing or delaying the onset
of the disease, and/or also lessening the severity or frequency of symptoms of the
disease). The amount that will be therapeutically effective in the treatment of a
particular individual's disorder or condition will depend on the symptoms and
severity of the disease, and can be determined by standard clinical techniques. In
addition, in vitro or in vivo assays may optionally be employed to help identify
optimal dosage ranges. The precise dose to be employed in the formulation will
also depend on the route of administration, and the seriousness of the disease or
disorder, and should be decided according to the judgment of a practitioner and each
patient's circumstances. Effective doses may be extrapolated from dose-response
curves derived from in vitro or animal model test systems.

In one embodiment, a nucleic acid of the invention (e.g., a nucleic acid
encoding an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptide, such as SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5,
SEQ ID NO: 7, or SEQ ID NO: 9, which may optionally comprise at least one
polymorphism, or a nucleic acid that encodes an HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide or a variant,
derivative or fragment thereof, such as a nucleic acid encoding the protein of SEQ
ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or SEQ ID NO: 10) can
be used, either alone or in a pharmaceutical composition as described above. For
example, HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) or
a cDNA encoding an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptide, either by itself or included within a vector, can be
introduced into cells (either in vitro or in vivo) such that the cells produce native
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide. If desired, cells that have been transformed with the gene or cDNA or
a vector comprising the gene or cDNA can be introduced (or re-introduced) into an
individual affected with the disease. Thus, cells that, in nature, lack native HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) expression and
activity, or have mutant HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) expression and activity, or have expression of a disease-associated
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) variant,
can be engineered to express an HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) polypeptide or an active fragment of an
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptide (or a different variant of an HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) polypeptide). In a preferred embodiment,
nucleic acid encoding the HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptide, or an active fragment or derivative thereof, can be
introduced into an expression vector, such as a viral vector, and the vector can be
introduced into appropriate cells in an animal. Other gene transfer systems,
including viral and nonviral transfer systems, can be used. Alternatively, nonviral
gene transfer methods, such as calcium phosphate coprecipitation, mechanical
techniques (e.g., microinjection), membrane fusion-mediated transfer via liposomes,
or direct DNA uptake, can also be used to introduce the desired nucleic acid
molecule into a cell.

Alternatively, in another embodiment of the invention, a nucleic acid of the
invention; a nucleic acid complementary to a nucleic acid of the invention; or a
portion of such a nucleic acid (e.g., an oligonucleotide as described below), can be
used in "antisense" therapy, in which a nucleic acid (e.g., an oligonucleotide) that
specifically hybridizes to the RNA and/or genomic DNA of HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) is administered or generated in
situ. The antisense nucleic acid that specifically hybridizes to the RNA and/or DNA
inhibits expression of the HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) nucleic acid molecule, e.g., by inhibiting translation and/or
transcription. Binding of the antisense nucleic acid can be by conventional base pair
complementarity, or, for example, in the case of binding to DNA duplexes, through
specific interaction in the major groove of the double helix.

An antisense construct of the present invention can be delivered, for
example, as an expression plasmid as described above. When the plasmid is
transcribed in the cell, it produces RNA that is complementary to a portion of the
mRNA and/or DNA that encodes an HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) polypeptide. Alternatively, the antisense
construct can be an oligonucleotide probe which is generated ex vivo and introduced
into cells; it then inhibits expression by hybridizing with the mRNA and/or genomic
DNA of HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS). In
one embodiment, the oligonucleotide probes are modified oligonucleotides that are
resistant to endogenous nucleases, e.g., exonucleases and/or endonucleases, thereby
rendering them stable in vivo. Exemplary nucleic acid molecules for use as
antisense oligonucleotides are phosphoramidate, phosphorothioate and
methylphosphonate analogs of DNA (see also U.S. Patent Nos. 5,176,996;
5,264,564; and 5,256,775). Additionally, general approaches to constructing
oligomers useful in antisense therapy are also described, for example, by Van der
Krol et al., Biotechniques 6: 958-976 (1988); and Stein et al., Cancer Res 48:
2659-2668 (1988). With respect to antisense DNA, oligodeoxyribonucleotides
derived from the translation initiation site, e.g., between the -10 and +10 regions of
an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) nucleic
acid sequence, are preferred.
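
As an illustration of an oligodeoxyribonucleotide derived from the translation initiation site, the following sketch takes the -10 to +10 window around a start codon and returns its reverse complement. It rests on several assumptions that are not stated in the text: the sequence is written in the DNA alphabet as the sense strand, position +1 is the A of the ATG with no position 0 (so the window is 20 nucleotides), and `toy_mrna` is an invented sequence, not any of SEQ ID NOs: 1 to 9. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_for_start_site(sense, atg_index, upstream=10, downstream=10):
    """Antisense oligo covering -upstream..+downstream around the start codon.

    atg_index is the 0-based index of the A in the ATG (position +1).
    """
    sense = sense.upper()
    if sense[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG start codon")
    if atg_index < upstream or atg_index + downstream > len(sense):
        raise ValueError("sequence too short for the requested window")
    target = sense[atg_index - upstream:atg_index + downstream]
    return reverse_complement(target)


# Invented sense-strand sequence used only for demonstration.
toy_mrna = "GCGGCCGCCACCATGGCTAGCAAGGGC"
oligo = antisense_for_start_site(toy_mrna, toy_mrna.index("ATG"))
print(oligo)
```

Using the first ATG found by `str.index` is a simplification; for a real transcript the annotated start codon position would be supplied instead.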

To perform antisense therapy, oligonucleotides (RNA, cDNA or DNA) are
designed that are complementary to mRNA encoding an HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) polypeptide. The antisense
oligonucleotides bind to HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) mRNA transcripts and prevent translation. Absolute
complementarity, although preferred, is not required. A sequence "complementary"
to a portion of an RNA, as referred to herein, indicates that a sequence has sufficient
complementarity to be able to hybridize with the RNA, forming a stable duplex; in
the case of double-stranded antisense nucleic acids, a single strand of the duplex
DNA may thus be tested, or triplex formation may be assayed. The ability to
hybridize will depend on both the degree of complementarity and the length of the
antisense nucleic acid, as described in detail above. Generally, the longer the
hybridizing nucleic acid, the more base mismatches with an RNA it may contain and
still form a stable duplex (or triplex, as the case may be). One skilled in the art can
ascertain a tolerable degree of mismatch by use of standard procedures.
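
The degree of complementarity discussed above can be quantified, as a first pass, by counting mismatched positions between an antisense oligonucleotide and its intended target site. The sketch below is an assumption-laden illustration: it handles only equal-length, gap-free pairing in the DNA alphabet, the names are hypothetical, and it sets no threshold, since the tolerable number of mismatches is left to standard experimental procedures.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(antisense, target_site):
    """Count positions where the antisense oligo fails to pair with the target.

    Both sequences are written 5'->3'; a perfectly complementary oligo equals
    the reverse complement of the target site and scores 0.
    """
    if len(antisense) != len(target_site):
        raise ValueError("sequences must have equal length")
    perfect = reverse_complement(target_site)
    return sum(1 for a, b in zip(antisense.upper(), perfect) if a != b)


# Invented target site used only for demonstration.
target = "GGCCGCCACCATGGCTAGCA"
perfect_oligo = reverse_complement(target)
# Swap the first base for a different one to create a single mismatch.
first = "A" if perfect_oligo[0] != "A" else "C"
one_off_oligo = first + perfect_oligo[1:]
print(count_mismatches(perfect_oligo, target), count_mismatches(one_off_oligo, target))
```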

The oligonucleotides used in antisense therapy can be DNA, RNA, or
chimeric mixtures or derivatives or modified versions thereof, single-stranded or
double-stranded. The oligonucleotides can be modified at the base moiety, sugar
moiety, or phosphate backbone, for example, to improve stability of the molecule,
hybridization, etc. The oligonucleotides can include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or compounds facilitating
transport across the cell membrane (see, e.g., Letsinger et al., Proc. Natl. Acad. Sci.
USA 86: 6553-6556 (1989); Lemaitre et al., Proc. Natl. Acad. Sci. USA 84: 648-652
(1987); PCT International Publication No. WO 88/09810) or the blood-brain barrier
(see, e.g., PCT International Publication No. WO 89/10134), or
hybridization-triggered cleavage agents (see, e.g., Krol et al., BioTechniques 6:
958-976 (1988)) or intercalating agents (see, e.g., Zon, Pharm. Res. 5: 539-549
(1988)). To this end, the oligonucleotide may be conjugated to another molecule
(e.g., a peptide, hybridization triggered cross-linking agent, transport agent,
hybridization-triggered cleavage agent).

The antisense molecules are delivered to cells that express HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) in vivo. A number of
methods can be used for delivering antisense DNA or RNA to cells; e.g., antisense
molecules can be injected directly into the tissue site, or modified antisense
molecules, designed to target the desired cells (e.g., antisense linked to peptides or
antibodies that specifically bind receptors or antigens expressed on the target cell
surface) can be administered systemically. Alternatively, in a preferred
embodiment, a recombinant DNA construct is utilized in which the antisense
oligonucleotide is placed under the control of a strong promoter (e.g., pol III or pol
II). The use of such a construct to transfect target cells in the patient results in the
transcription of sufficient amounts of single stranded RNAs that will form
complementary base pairs with the endogenous HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) transcripts and thereby prevent translation of the
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) mRNA.
For example, a vector can be introduced in vivo such that it is taken up by a cell and
directs the transcription of an antisense RNA. Such a vector can remain episomal or
become chromosomally integrated, as long as it can be transcribed to produce the
desired antisense RNA. Such vectors can be constructed by recombinant DNA
technology methods standard in the art and described above. For example, a
plasmid, cosmid, YAC, or viral vector can be used to prepare the recombinant DNA
construct that can be introduced directly into the tissue site. Alternatively, viral
vectors can be used that selectively infect the desired tissue, in which case
administration may be accomplished by another route (e.g., systemically).

Endogenous HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) expression can also be reduced by inactivating or "knocking out"
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) nucleic acid
sequences or their promoters using targeted homologous recombination (e.g., see
Smithies et al., Nature 317: 230-234 (1985); Thomas and Capecchi, Cell 51:
503-512 (1987); Thompson et al., Cell 56: 313-321 (1989)). For example, a mutant,
non-functional HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) (or a completely unrelated DNA sequence) flanked by DNA
homologous to the endogenous HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) (either the coding regions or regulatory regions
of HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)) can be
used, with or without a selectable marker and/or a negative selectable marker, to
transfect cells that express HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) in vivo. Insertion of the DNA construct, via targeted homologous
recombination, results in inactivation of HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS). The recombinant DNA constructs can be
directly administered or targeted to the required site in vivo using appropriate
vectors, as described above. Alternatively, expression of non-mutant HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) can be increased
using a similar method: targeted homologous recombination can be used to insert a
DNA construct comprising a non-mutant, functional HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) (e.g., a gene having SEQ ID
NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, or SEQ ID NO: 9, which
may optionally comprise at least one polymorphism), or a portion thereof, in place
of a mutant HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
in the cell, as described above. In another embodiment, targeted homologous
recombination can be used to insert a DNA construct comprising a nucleic acid that
encodes an HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) polypeptide variant that differs from that present in the cell.

Alternatively, endogenous HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) expression can be reduced by targeting
deoxyribonucleotide sequences complementary to the regulatory region of HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) (i.e., the HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) promoter and/or
enhancers) to form triple helical structures that prevent transcription of HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) in target cells in the
body. (See generally, Helene, Anticancer Drug Des., 6(6): 569-84 (1991); Helene et
al., Ann. N.Y. Acad. Sci., 660: 27-36 (1992); and Maher, Bioessays 14(12): 807-15
(1992)). Likewise, the antisense constructs described herein, by antagonizing the
normal biological activity of one of the HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) proteins, can be used in the manipulation of
tissue, e.g., tissue differentiation, both in vivo and for ex vivo tissue cultures.
Furthermore, the antisense techniques (e.g., microinjection of antisense molecules,
or transfection with plasmids whose transcripts are anti-sense with regard to an
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) mRNA or
gene sequence) can be used to investigate the role of HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) in developmental events, as
well as the normal cellular function of HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) in adult tissue. Such techniques can be utilized
in cell culture, but can also be used in the creation of transgenic animals.

In yet another embodiment of the invention, other HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) therapeutic compounds as
described herein can also be used in the treatment or prevention of a cell
proliferation disease, an apoptotic disease, or a cell differentiation disease. The
therapeutic compounds can be delivered in a composition, as described above, or by
themselves. They can be administered systemically, or can be targeted to a
particular tissue. The therapeutic compounds can be produced by a variety of
means, including chemical synthesis; recombinant production; in vivo production
(e.g., a transgenic animal, such as U.S. Patent No. 4,873,316 to Meade et al.), for
example, and can be isolated using standard means such as those described herein.

A combination of any of the above methods of treatment (e.g.,
administration of non-mutant HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) polypeptide in conjunction with antisense
therapy targeting mutant HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) mRNA; administration of a first variant encoded by HDAC9,
HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) in conjunction with
antisense therapy targeting a second variant encoded by HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)) can also be used.

In another embodiment, the invention is directed to HDAC9, HDAC9a,
HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS) nucleic acid molecules and
HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or HDRP(ANLS)
polypeptides for use as a medicament in therapy. For example, the nucleic acid
molecules or polypeptides of the present invention can be used in the treatment of a
cell proliferation disease, an apoptotic disease, or a cell differentiation disease. In
addition, the HDAC9, HDAC9a, HDAC9(ANLS), HDAC9a(ANLS), or
HDRP(ANLS) nucleic acid molecules and HDAC9, HDAC9a, HDAC9(ANLS),
HDAC9a(ANLS), or HDRP(ANLS) polypeptides described herein can be used in
the manufacture of a medicament for the treatment of a cell proliferation disease, an
apoptotic disease, or a cell differentiation disease.

The invention will be further described by the following non-limiting
examples. The teachings of all publications cited herein are incorporated herein by
reference in their entirety.

EXEMPLIFICATION 
Cloning of cDNA encoding a novel HDAC, designated HDAC9

HDAC9 was cloned by PCR and 3' rapid amplification of cDNA ends using 
primers designed from the sequence of human chromosome 7 whose translated 
product exhibited 80% identity to the HDAC domain of HDAC4, described in detail 
as follows. 

Database analyses indicate that HDRP is located on chromosome 7 (7p15-p21).
The human genome database (February 2001 release) of GenBank was
searched using the human HDAC4 amino acid sequence. The TBLASTN program
was used to identify open reading frames downstream of HDRP on chromosome 7
that exhibit significant homology to the HDAC domain of HDAC4. Several
fragments whose translated products exhibit over 58% identity were retrieved. Two
sense primers (OL486, 5'-CCATGGAAACGGTACCCAGCAGGC-3' (SEQ ID NO:
16) and OL487, 5'-CACTCCATCGCTATGATGAAGGG-3' (SEQ ID NO: 17)) and
antisense primers (OL484, 5'-AGTTCCCTTCATCATAGCGATGG-3' (SEQ ID
NO: 18) and OL485, 5'-AATGTACAGGATGCTGGGGT-3' (SEQ ID NO: 19))
were each designed based upon one of these fragments whose translated products
matched amino acids 842-873 of HDAC4. RT-PCR was performed using each of
the antisense primers and a sense primer
(5'-CCCTTGTAGCTGGTGGAGTTCCCTT-3' (SEQ ID NO: 20)) from the coding
region of HDRP and human brain cDNA as a template. PCR was performed in a
Biometra TGRADIENT Thermocycler for 30 cycles at 95°C for 20 seconds, 60°C
for 20 seconds, and 72°C for 120 seconds.
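The relationship between two of the primers listed above can be checked with a short script. The following Python sketch is illustrative only and is not part of the cloning procedure; the helper names `reverse_complement` and `longest_shared_run` are hypothetical. It takes the sequences given for SEQ ID NOS: 17 and 18 and shows that the antisense primer OL484, read on the sense strand, overlaps the sense primer OL487, which is consistent with both having been designed from the same stretch of sequence.

```python
# Primer sequences as given in the text (written 5' to 3').
OL487 = "CACTCCATCGCTATGATGAAGGG"  # sense primer, SEQ ID NO: 17
OL484 = "AGTTCCCTTCATCATAGCGATGG"  # antisense primer, SEQ ID NO: 18

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_shared_run(a: str, b: str) -> str:
    """Return the longest substring of `a` that also occurs in `b`."""
    best = ""
    for i in range(len(a)):
        # Only look for runs longer than the best one found so far.
        for j in range(i + len(best) + 1, len(a) + 1):
            if a[i:j] in b:
                best = a[i:j]
            else:
                break  # a longer run starting at i cannot match either
    return best


ol484_on_sense_strand = reverse_complement(OL484)
shared = longest_shared_run(OL487, ol484_on_sense_strand)

print(ol484_on_sense_strand)  # CCATCGCTATGATGAAGGGAACT
print(shared, len(shared))    # CCATCGCTATGATGAAGGG 19
```

The 19-nucleotide shared run means the two primers anneal to opposite strands of the same site, so they would not be used together as a primer pair.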

3'-rapid amplification of cDNA ends was performed using the sense primer
OL486 and adaptor primer 1 (Clontech), and Marathon-Ready cDNA from human
brain (Clontech, Palo Alto, CA) according to the manufacturer's instructions. The
products were re-amplified using nested sense primer OL487 and adaptor primer 2
(Clontech, Palo Alto, CA). PCR products were cloned into the pGEM-T Easy vector
(Promega, Madison, WI) and sequenced using an automated DNA sequencer at the
DNA Sequencing Core Facility of the Memorial Sloan-Kettering Cancer Center,
using DNA sequencing methods known to one of skill in the art.

Two cDNAs were cloned by the above-described methods. One cDNA
(SEQ ID NO: 1) encodes an HDAC9 protein that is 1011 amino acids in length. The
other cDNA (SEQ ID NO: 3) encodes an HDAC9a protein that is 879 amino acids
long. The cDNA and amino acid sequences of HDAC9 and HDAC9a are shown
in FIGS. 1A-1G and FIGS. 2A-2B, respectively. Database analyses of these cDNAs
against human genomic DNA sequences indicated that these two cDNAs are
generated by alternative splicing. An alignment of HDAC9, HDAC9a, HDRP,
and HDAC4 is shown in FIGS. 3A-3C.
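The stated length of the HDAC9 protein can be cross-checked against the cDNA in the sequence listing. The Python sketch below is illustrative only. It assumes, from inspection of the listing for SEQ ID NO: 1, that the open reading frame starts at the ATG at nucleotide 151 and ends with the stop codon at nucleotides 3184-3186; the function names `translate` and `coding_codons` are hypothetical, and the codon table is the standard genetic code.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1 into one-letter amino acid codes."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def coding_codons(start: int, stop_end: int) -> int:
    """Number of amino acid codons in an ORF given 1-based coordinates.

    `start` is the first base of the ATG and `stop_end` is the last base
    of the stop codon, which is not counted.
    """
    return (stop_end - start + 1) // 3 - 1


# Nucleotides 151-180 of SEQ ID NO: 1 as they appear in the listing.
first_30_nt = "atgcacagtatgatcagctcagtggatgtg"

print(translate(first_30_nt))    # MHSMISSVDV
print(coding_codons(151, 3186))  # 1011
```

Under these assumed coordinates the open reading frame holds 1011 codons, and the first ten residues agree with the start of SEQ ID NO: 2 (Met His Ser Met Ile Ser Ser Val Asp Val).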

Each of the HDAC9 and HDAC9a nucleic acid sequences was cloned into
the pFLAG-CMV-5b vector (Sigma) in frame with the C-terminal FLAG tag. Only
the coding regions plus three extra base pairs (ACQ of cDNA of the HDAC9 and
HDAC9a nucleic acid sequences were included in the constructs. These constructs
are referred to herein as HDAC9-FLAG and HDAC9a-FLAG, respectively. These
constructs are contained in E. coli, and can readily be expressed. For HDAC9, the
insert is 3033 bp and for HDAC9a, the insert size is 2637 bp. Both HDAC9 and
HDAC9a can be released with EcoRV and BamHI (whose sites have been
incorporated in the primers to obtain HDAC9 and HDAC9a coding cDNA for
cloning purposes) restriction enzyme digestion.
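The reported insert sizes can be compared with the protein lengths given above. This minimal Python sketch assumes three nucleotides per amino acid and deliberately ignores the stop codon, the three extra base pairs, and the restriction sites, so it is a rough consistency check rather than a description of the constructs. The names used are hypothetical.

```python
# Protein lengths (amino acids) and insert sizes (bp) as stated in the text.
PROTEIN_LENGTH_AA = {"HDAC9": 1011, "HDAC9a": 879}
REPORTED_INSERT_BP = {"HDAC9": 3033, "HDAC9a": 2637}


def coding_length_bp(n_residues: int) -> int:
    """Length in base pairs of the codons encoding `n_residues` amino acids."""
    return 3 * n_residues


for name, n_residues in PROTEIN_LENGTH_AA.items():
    expected = coding_length_bp(n_residues)
    reported = REPORTED_INSERT_BP[name]
    print(f"{name}: {n_residues} aa -> {expected} bp (reported {reported} bp)")
```

For both isoforms the reported insert size equals three times the number of residues.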

The HDAC9 cDNA sequences from the known 5'-end of HDRP cDNA to the
3'-untranslated region cloned in this study cover over 511 kb of genomic DNA on
chromosome 7. As shown in FIG. 4, the coding region cDNA of HDAC9 resides in
23 exons spanning 458 kb of genomic sequence. Exons 21, 22, and 23 are one
single exon in HDAC9a, but the middle exon that is numbered exon 22 in FIG. 4,
containing an in-frame stop codon, is spliced out in HDAC9. In addition, exons 12
and 13 are a single exon used by HDRP. Exon 13 is spliced as part of an intron in
HDAC9 and HDAC9a.

Further analysis revealed that exon 7, which contains a nuclear localization
signal (NLS), is alternatively spliced in an HDRP isoform, creating HDRP(ΔNLS).
RT-PCR analyses using primers based on sequences from exon 6 and exon 14
indicate that this alternative splicing event also occurs in HDAC9 and/or HDAC9a.
Thus, it is possible that at least 6 proteins can be generated from a single HDAC9
gene by alternative splicing of its RNA. The cDNA sequences and amino acid
sequences for HDAC9, HDAC9a, HDAC9(ΔNLS), HDAC9a(ΔNLS), and
HDRP(ΔNLS) are shown in FIGS. 1A-10 and 2A-2E, respectively.

HDAC9 mRNA is differentially expressed among human tissues

The expression of HDAC9 mRNA was determined by Northern blot analysis
using a human multiple tissue Northern blot (Clontech, Palo Alto, CA).
Hybridization was performed according to the manufacturer's instructions using
ExpressHyb solution (Clontech, Palo Alto, CA). The probe was the ³²P-labeled
(random priming) 3'-untranslated region common to both HDAC9 and HDAC9a,
which shares no significant sequence homology with HDRP. Two transcripts at
9.8 and 4.1 kb were detected in all tissues examined (FIG. 6A). The 4.1 kb
transcript is shorter than the 4.4 kb HDRP transcript (see Zhou et al., Proc. Natl.
Acad. Sci. USA, 97:1056-1061 (2000)). A third transcript at 1.2 kb was detected in
placenta (FIG. 6A). Similar to HDRP (see Zhou, X., et al., Proc. Natl. Acad. Sci.
USA, 97:1056-1061 (2000)), high levels of HDAC9 transcripts were detected in
brain and skeletal muscle (FIG. 6A).

The distribution of alternatively spliced mRNA variants among tissues was
examined by RT-PCR using primers (OL516 5'-TGTGTCATCGAGCTGGCTTC-3'
(SEQ ID NO: 21) and OL517 5'-ATCTTCTGCAAGTGGCTCCA-3' (SEQ ID NO:
22)) spanning the alternatively spliced exon 22 and a cDNA panel from the same
tissues as the multiple tissue Northern blot. PCR was performed in a Biometra
TGRADIENT Thermocycler for 30 cycles at 95°C for 20 seconds, 60°C for 20
seconds, and 72°C for 60 seconds. The expected sizes of PCR products were 680
base pairs for HDAC9 and 993 base pairs for HDAC9a. The ratio of HDAC9 to
HDAC9a transcripts differed among tissues (FIG. 6B). In the placenta and kidney,
the levels of the two transcripts were about the same (FIG. 6B). In the brain, heart,
and pancreas, there were more transcripts of HDAC9 than HDAC9a. In the other
tissues examined, there were more HDAC9a transcripts than HDAC9 transcripts
(FIG. 6B). Under the conditions tested, HDAC9 transcripts were undetectable in
liver (FIG. 6B). The lung had an HDAC9 product that was larger than expected and
abundant. The lung also had low levels of HDAC9 transcripts and HDAC9a
transcripts (FIG. 6B). An additional PCR product was also amplified from cDNA of
the pancreas; this product differed in size from the expected products from HDAC9
and HDAC9a (FIG. 6B). The identity of the different sized transcripts is unknown.
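Assigning an observed RT-PCR band to an isoform by its size can be sketched as follows. This Python helper is illustrative only and is not the analysis used here; the function name `assign_product` and the 5% size tolerance are assumptions chosen for the example, and the observed sizes passed to it are made-up values.

```python
# Expected RT-PCR product sizes (base pairs) as stated in the text.
EXPECTED_BP = {"HDAC9": 680, "HDAC9a": 993}


def assign_product(observed_bp: float, tolerance: float = 0.05) -> str:
    """Return the isoform whose expected size is within `tolerance`
    (as a fraction of the expected size) of the observed band, or
    "unassigned" if neither matches."""
    for isoform, expected in EXPECTED_BP.items():
        if abs(observed_bp - expected) <= tolerance * expected:
            return isoform
    return "unassigned"


# Hypothetical band sizes, for illustration only.
print(assign_product(690))   # HDAC9
print(assign_product(1000))  # HDAC9a
print(assign_product(1300))  # unassigned

# Size difference between the two expected products.
print(EXPECTED_BP["HDAC9a"] - EXPECTED_BP["HDAC9"])  # 313
```

The larger product corresponds to HDAC9a because its transcript retains exon 22, while the 313 bp difference is simply the arithmetic gap between the two stated sizes.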

HDAC9 and HDAC9a possess histone deacetylase activity

HDAC9 was named based on sequence homology to HDAC4 (FIGS. 3A-3C).
To determine whether HDAC9 and HDAC9a possess HDAC activity, an
HDAC enzymatic assay was performed using anti-FLAG immunoprecipitated
HDAC9-FLAG and HDAC9a-FLAG.

C-terminal FLAG-tagged HDAC9 (HDAC9-FLAG) and HDAC9a
(HDAC9a-FLAG) expression vectors were constructed using the pFLAG-CMV-5b
vector (Sigma) and PCR-amplified coding regions of HDAC9 and HDAC9a in
frame with the FLAG tag to form pFLAG-CMV-5b-HDAC9 (plasmid VR1) and
pFLAG-CMV-5b-HDAC9a (plasmid VR2). All constructs were confirmed by DNA
sequencing.

Transfection of human kidney 293T cells, immunoprecipitation using anti-FLAG
M2 Agarose (Sigma), Western blot analyses and dual luciferase assays were
performed essentially as previously described by Zhou et al. (Proc. Natl. Acad. Sci.
USA, 97:1056-1061 (2000)). Briefly, the cells (American Type Culture Collection)
were cultured in DME HG medium (GIBCO/BRL) supplemented with 10%
(vol/vol) FBS at 37°C in a 5% CO₂ atmosphere. Transient transfection was
performed by using Lipofectamine (GIBCO/BRL) or Fugene 6 (Roche Molecular
Biochemicals) according to the manufacturers' instructions. Cells were harvested
24 to 48 hours after transfection and lysed in IP lysis buffer (50 mM Tris-HCl, pH
7.5/120 mM NaCl/5 mM EDTA/0.5% NP-40) at 5 × 10⁷ cells per ml.
Immunoprecipitation with anti-FLAG M2-agarose (Sigma, St. Louis, MO) was
performed according to the manufacturer's instructions. Immunoprecipitated
proteins were released from the agarose beads by using FLAG-peptide and either
used directly for HDAC enzymatic activity assays or resolved on SDS/PAGE for
Western blot analyses. Anti-FLAG antibody was purchased from Sigma (St. Louis,
MO). Western blot analyses were performed using standard methods.

HDAC9 and HDAC9a enzymatic activity was assessed with the HDAC
Fluorescent Activity Assay/Drug Discovery Kit-AK-500 (BIOMOL Research
Laboratories), using the FLUOR DE LYS™ substrate, which contains an acetylated
lysine side chain, and immunoprecipitated HDAC9-FLAG and HDAC9a-FLAG
polypeptides according to the manufacturer's instructions. Fluorescence was read on
a SPECTRAmax® GEMINI XS microplate spectrofluorometer using the SOFTmax®
PRO system (Molecular Devices) at excitation 355 nm and emission 460 nm with a
cutoff filter of 455 nm. Briefly, HDAC9-FLAG and HDAC9a-FLAG were incubated
with the substrate overnight at room temperature in a 96-well plate. The reaction was
stopped by addition of Fluor De Lys™ Developer and samples were read with the
fluorometer.

As shown in FIG. 7, both HDAC9-FLAG and HDAC9a-FLAG deacetylated
the acetylated lysine of FLUOR DE LYS™ and the activity of HDAC9 and
HDAC9a was comparable. To examine the activity of HDAC9 and HDAC9a further,
inhibition studies using TSA were carried out by preincubating HDAC9-FLAG and
HDAC9a-FLAG with TSA for 15 minutes at room temperature. The assay was then
carried out as stated above. As shown in FIG. 7, TSA inhibited HDAC9 and
HDAC9a deacetylase activity. The inset gel in FIG. 7 shows the amount of protein
used in the assay. SAHA, a potent HDAC inhibitor (Richon et al., Proc. Natl. Acad.
Sci. USA, 95:3003-3007 (1998)), also completely inhibited the histone deacetylase
activity of HDAC9-FLAG and HDAC9a-FLAG. The HDAC activity of HDAC9
and HDAC9a was about ten times lower than the deacetylase activity of HDAC4
when comparable amounts of protein were used under the conditions tested here.
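Inhibition in a fluorometric assay of this kind is commonly expressed as a percentage of the uninhibited, background-corrected signal. The Python sketch below shows one such calculation under stated assumptions; it is not the analysis performed for FIG. 7. The function name `percent_inhibition` is hypothetical, and the fluorescence readings are invented values used only to exercise the arithmetic.

```python
def percent_inhibition(untreated_rfu: float, treated_rfu: float, blank_rfu: float) -> float:
    """Percent inhibition from relative fluorescence units (RFU).

    `blank_rfu` is the reading of a no-enzyme well and is subtracted
    from both the untreated and the inhibitor-treated readings.
    """
    untreated_net = untreated_rfu - blank_rfu
    treated_net = treated_rfu - blank_rfu
    if untreated_net <= 0:
        raise ValueError("untreated signal must exceed the blank")
    return 100.0 * (untreated_net - treated_net) / untreated_net


# Hypothetical readings, not data from this study.
blank = 100.0
enzyme_only = 1100.0
enzyme_plus_inhibitor = 150.0

print(percent_inhibition(enzyme_only, enzyme_plus_inhibitor, blank))  # 95.0
```

Subtracting the blank first matters because the developer and substrate contribute fluorescence of their own, which would otherwise make inhibition look less complete than it is.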

HDAC9 and HDAC9a enzymatic activity was also determined through
HDAC enzymatic assays using ³H-histones isolated from murine erythroleukemia
cells as a substrate. This assay was performed essentially as described by Richon et
al. (Proc. Natl. Acad. Sci. USA, 95:3003-3007 (1998)). Briefly, HDAC9-FLAG
and HDAC9a-FLAG were incubated with ³H-histones overnight at 37°C. The
reaction was stopped by the addition of 1 M HCl/0.1 acetic acid. Released ³H-acetic
acid was extracted with ethyl acetate and quantified by scintillation counting. For
inhibition studies, the immunoprecipitated complexes were preincubated with the
different HDAC inhibitors for 30 minutes at 4°C.

As shown in FIG. 8, HDAC9a-FLAG deacetylated ³H-acetyl-histones.
SAHA, a potent HDAC inhibitor, also completely inhibited the histone deacetylase
activity of HDAC9a-FLAG. TSA also inhibited HDAC9a deacetylase activity.
Similar results were obtained when HDAC9 was used as the enzyme source.

HDAC9 and HDAC9a repress MEF2-mediated transcription

The Xenopus homolog of HDRP, MITR, was identified as a MEF2-interacting
transcriptional repressor (Sparrow et al., EMBO J. 18:5085-5098 (1999)),
and mouse HDRP also interacts with MEF2 and represses MEF2-mediated
transcription (Zhang et al., J. Biol. Chem. 276:35-39 (2001)). We first tested
whether HDAC9-FLAG and HDAC9a-FLAG interact with MEF2. 293 cells were
transfected with vector, HDAC9-FLAG, or HDAC9a-FLAG. The cells were
subsequently lysed and HDAC9-FLAG and HDAC9a-FLAG proteins were
immunoprecipitated with anti-FLAG antibodies. Western blot analysis of the
immunoprecipitated proteins was carried out, using anti-MEF-2 antibody to probe
the blot. As shown in FIG. 9A, both HDAC9 and HDAC9a interacted with MEF2 in
293T cells.

It was then determined whether HDAC9 and HDAC9a repress MEF2-mediated
transcription. This determination was carried out as follows. The
p3XMEF2-luciferase reporter gene (100 ng) and the vector pRL-TK (Promega) (5
ng) were co-transfected into 293T cells in the absence (pcDNA3 empty vector) or
presence of MEF2C (100 ng of pCMV-MEF2C). HDAC9-F (1 ng, 10 ng, or 100 ng
of pFLAG-HDAC9; pFLAG-HDAC9 and HDAC9-FLAG are different constructs,
with the FLAG sequence located at opposite ends of the HDAC9 nucleotide
sequence, but are functionally equivalent) or HDAC9a-F (1 ng, 10 ng, or 100 ng of
pFLAG-HDAC9a; pFLAG-HDAC9a and HDAC9a-FLAG are different constructs,
with the FLAG sequence located at opposite ends of the HDAC9a nucleotide
sequence, but are functionally equivalent) was included in a subset of experimental
groups with the MEF2C vector. pFLAG empty vector was used to adjust the DNA to
an equal amount in each transfection. The cells were harvested 24 to 36 hours after
transfection and the luciferase activities were measured using the
Dual-Luciferase™ Reporter Assay System from Promega according to the
manufacturer's instructions. The firefly
luciferase activity was first normalized to the co-transfected Renilla luciferase
activity (encoded by the pRL-TK vector), and the luciferase activity value for cells
transfected with MEF2C alone was set at 1. MEF2C activated transcription over 30
times the basal level of transcription. As shown in FIG. 9B, HDAC9-FLAG and
HDAC9a-FLAG repressed MEF2C-mediated transcriptional activation in a
dose-dependent manner and completely abolished the activation at the 100 ng dose
for both HDAC9 and HDAC9a. The transcriptional repression effect of HDAC9 and
HDAC9a on MEF2C-mediated transcription was specific, since a co-transfected
reporter gene for transfection efficiency containing a TK promoter was
not repressed by HDAC9 or HDAC9a.
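The two-step normalization described above (firefly signal divided by Renilla signal, then scaled so that MEF2C alone equals 1) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis behind FIG. 9B; the function name `relative_activity` is hypothetical and all luminescence readings are invented.

```python
def relative_activity(firefly: float, renilla: float,
                      ref_firefly: float, ref_renilla: float) -> float:
    """Firefly/Renilla ratio of a sample, expressed relative to the
    firefly/Renilla ratio of a reference sample (here, MEF2C alone)."""
    if renilla <= 0 or ref_renilla <= 0 or ref_firefly <= 0:
        raise ValueError("luminescence readings must be positive")
    return (firefly / renilla) / (ref_firefly / ref_renilla)


# Hypothetical (firefly, Renilla) readings, for illustration only.
mef2c_alone = (30000.0, 1000.0)
basal = (1000.0, 1000.0)
mef2c_plus_repressor = (1500.0, 1000.0)

for label, (firefly, renilla) in [
    ("MEF2C alone", mef2c_alone),
    ("basal", basal),
    ("MEF2C + repressor", mef2c_plus_repressor),
]:
    value = relative_activity(firefly, renilla, *mef2c_alone)
    print(f"{label}: {value:.3f}")
```

Dividing by the Renilla signal corrects each well for transfection efficiency, which is why an unchanged Renilla reading across conditions supports the conclusion that repression is specific to the MEF2 reporter.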

Described herein is the identification and characterization of a new class II
HDAC, designated HDAC9. HDAC9 has several alternatively spliced isoforms,
one of which is the previously identified HDRP (Zhou et al., Proc. Natl. Acad. Sci.
USA 97:1056-1061 (2000)). HDAC9 and HDAC9a possess HDAC activity, although
their specific enzymatic activity appears to be lower than that of HDAC4. While not
wishing to be bound by any particular theory, it is possible that an essential co-factor
is lost during immunoprecipitation or does not exist in 293T cells (for example,
metastasis-associated protein 2 is essential for the assembly of a catalytically active
HDAC1 (Zhang et al., Genes Dev. 13:1924-1935 (1999))), that the substrates used
are not the natural substrates, or that the FLAG tag interferes with the folding of the
protein.

Searching the human genome with the HDAC domain from either HDAC1
or HDAC9 identified a total of 10 HDACs in the presently completed human
genome sequence, a number of which are schematically represented in FIG. 10.
HDACs 1, 2, 3, 8, 4, 5, 6, 7, 9, and 9a all have HDAC domains. HDRP, which is
also schematically depicted in FIG. 10, does not have a catalytic domain.

All references described herein are incorporated by reference in their
entirety. While this invention has been particularly shown and described with
reference to preferred embodiments thereof, it will be understood by those skilled in
the art that various changes in form and details may be made therein without
departing from the spirit and scope of the invention as defined by the appended
claims.

CLAIMS

What is claimed is: 

1. An isolated or recombinant histone deacetylase polypeptide, said polypeptide
selected from:

a) an isolated or recombinant polypeptide comprising SEQ ID NO: 2,
SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or SEQ ID NO: 10;
and

b) an isolated or recombinant polypeptide having at least 60% sequence
identity with any one of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO:
6, SEQ ID NO: 8, or SEQ ID NO: 10.

2. The isolated or recombinant histone deacetylase polypeptide of Claim 1, said
polypeptide selected from:

a) a polypeptide consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID
NO: 6, SEQ ID NO: 8, or SEQ ID NO: 10.

3. The isolated or recombinant histone deacetylase polypeptide of Claim 1,
wherein said polypeptide is human.

4. An isolated nucleic acid molecule selected from the group: 

a) an isolated nucleic acid comprising SEQ ID NO: 1, SEQ ID NO: 3, 
SEQ ID NO: 5, SEQ ID NO: 7, or SEQ ID NO: 9;

b) a complement of an isolated nucleic acid comprising SEQ ID NO: 1,
SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, or SEQ ID NO: 9;

c) an isolated nucleic acid encoding a histone deacetylase polypeptide
of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or
SEQ ID NO: 10;

d) a complement of an isolated nucleic acid encoding a histone
deacetylase polypeptide of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID
NO: 6, SEQ ID NO: 8, or SEQ ID NO: 10;

e) a nucleic acid that is hybridizeable under high stringency conditions
to a nucleic acid molecule that encodes any of SEQ ID NO: 2, SEQ
ID NO: 4, SEQ ID NO: 6, or SEQ ID NO: 8, or a complement
thereof; or

f) a nucleic acid molecule that is hybridizeable under high stringency
conditions to a nucleic acid comprising SEQ ID NO: 1, SEQ ID NO:
3, SEQ ID NO: 5, or SEQ ID NO: 7; and

g) an isolated nucleic acid molecule that has at least 55% sequence
identity with any one of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO:
5, SEQ ID NO: 7, SEQ ID NO: 9, or a complement thereof.

5. The isolated nucleic acid molecule of Claim 4, said nucleic acid molecule
consisting of the nucleic acid molecule selected from the group consisting of
SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, and SEQ ID
NO: 9.

6. The isolated nucleic acid molecule of Claim 4, wherein said nucleic acid
molecule is human.



7. A vector comprising the isolated nucleic acid molecule of Claim 4.

8. A cell comprising the vector of Claim 7.

9. A cell comprising the isolated nucleic acid molecule of Claim 4.

10. A purified antibody that selectively binds a polypeptide of Claim 1.

11. A method of identifying a compound that modulates expression of a nucleic
acid molecule of Claim 4, said method comprising the steps of:

a) contacting said nucleic acid molecule with a candidate compound
under conditions suitable for expression; and

b) assessing the level of expression of said nucleic acid molecule,

wherein a candidate compound that increases or decreases expression of said
nucleic acid molecule relative to a control is a compound that modulates
expression of said nucleic acid molecule.

12. The method of Claim 11, wherein said method is carried out in a cell or
animal.

13. The method of Claim 11, wherein said method is carried out in a cell free
system.

14. A method of identifying a compound that modulates the enzymatic activity
of the polypeptide of Claim 1, said method comprising the steps of:

a) contacting said polypeptide with a candidate compound under
conditions suitable for enzymatic reaction; and

b) assessing the enzymatic activity level of said polypeptide,

wherein a candidate compound that increases or decreases the enzymatic
activity level of said polypeptide relative to a control is a compound that
modulates the enzymatic activity of said polypeptide.

15. The method of Claim 14, wherein said method is carried out in a cell or
animal.

16. The method of Claim 14, wherein said method is carried out in a cell free
system.



17. The method of Claim 14, wherein said polypeptide is further contacted with
a substrate for the polypeptide, and wherein said substrate is selected from
the group consisting of a cell proliferation disease binding agent, an
apoptotic disease binding agent, and a cell differentiation disease binding
agent.

18. The method of Claim 17, wherein said candidate compound is an inhibitor.

19. The method of Claim 17, wherein said candidate compound is an activator.

20. A method of identifying a compound that modulates the transcriptional 
repression activity of the polypeptide of Claim 1, said method comprising 

the steps of:

a) contacting said polypeptide with a candidate compound under
conditions suitable for a transcriptional repression reaction; and

b) assessing the transcriptional repression activity level of said
polypeptide,

wherein a candidate compound that increases or decreases the transcriptional
repression activity level of said polypeptide relative to a control is a
compound that modulates the transcriptional repression activity of said
polypeptide.

21. The method of Claim 20, wherein said method is carried out in a cell or
animal.



22. The method of Claim 20, wherein said method is carried out in a cell free 
system. 

23. The method of Claim 20, wherein said polypeptide is further contacted with
a substrate for the polypeptide, and wherein said substrate is selected from
the group consisting of a cell proliferation disease binding agent, an
apoptotic disease binding agent, and a cell differentiation disease binding
agent.

24. The method of Claim 23, wherein said candidate compound is an inhibitor.

25. The method of Claim 23, wherein said candidate compound is an activator.

26. A method of identifying a compound that modulates expression of a nucleic 
acid molecule of Claim 4, said method comprising the steps of: 

a) providing a nucleic acid molecule comprising a promoter region of
said nucleic acid of Claim 4 or part of a promoter region of said
nucleic acid of Claim 4 operably linked to a reporter gene;

b) contacting said nucleic acid molecule or with a candidate compound;
and

c) assessing the level of said reporter gene,

wherein a candidate compound that increases or decreases expression of said
reporter gene relative to a control is a compound that modulates expression
of said nucleic acid molecule of Claim 4.

27. The method of Claim 26, wherein said method is carried out in a cell.



28. A method of identifying a polypeptide that interacts with a polypeptide of 
Claim 1 in a yeast two-hybrid system, said method comprising the steps of: 

a) providing a first nucleic acid vector comprising a nucleic acid 

molecule encoding a DNA binding domain and said polypeptide of
Claim 1;

b) providing a second nucleic acid vector comprising a nucleic acid
encoding a transcription activation domain and a nucleic acid
encoding a test polypeptide;

c) contacting said first nucleic acid vector with said second nucleic acid
vector in a yeast two-hybrid system; and

d) assessing transcriptional activation in said yeast two-hybrid system,

wherein an increase in transcriptional activation relative to a control
indicates that the test polypeptide is a polypeptide that interacts with said
polypeptide of Claim 1.



29. A pharmaceutical composition comprising a polypeptide of Claim 1.

30. A method of diagnosing a cell proliferation disease, an apoptotic disease, or
a cell differentiation disease in a subject, said method comprising the steps
of:

a) obtaining a sample from said subject; and

b) assessing the level of activity or expression of said polypeptide of
Claim 1 in said sample, or detecting the level of said nucleic acid
molecule of Claim 4,

wherein if said level is increased relative to a control, then said subject has
an increased likelihood of having a cell proliferation disease, an apoptotic
disease, or a cell differentiation disease, and wherein if said level is
decreased relative to a control, then said subject has a decreased likelihood
of having a cell proliferation disease, an apoptotic disease, or a cell
differentiation disease.

31. The method of Claim 30, wherein said level of activity or expression of said
polypeptide of Claim 1 in said sample is measured using
immunohistochemical techniques.

32. The method of Claim 30, wherein said level of said nucleic acid molecule of
Claim 4 in said sample is measured using in situ hybridization techniques.

33. A method of treating a cell proliferation disease, an apoptotic disease, or a
cell differentiation disease, said method comprising administering a
compound identified by the method of Claim 14.

34. A method of treating a cell proliferation disease, an apoptotic disease, or a
cell differentiation disease, said method comprising administering a
compound identified by the method of Claim 20.

SEQUENCE LISTING

<110> Sloan-Kettering Institute for Cancer Research 
Richon, Victoria 
Zhou, Xianbo 
Rifkind, Richard A. 
Marks, Paul A. 

<120> HDAC9 Polypeptides and Polynucleotides 
and Uses Thereof 

<130> 3254.1000005 

<150> 60/298,173 
<151> 2001-06-14 

<150> 60/311,686 
<151> 2001-08-10 

<150> 60/316,995 
<151> 2001-09-04 

<160> 22 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 3186 
<212> DNA 

<213> Homo sapiens 
<400> 1 

ggggaagaga ggcacagaca cagataggag aagggcaccg gctggagcca cttgcaggac 60 
tgagggtttt tgcaacaaaa ccctagcagc ctgaagaact ctaagccaga tggggtggct 120 
ggacgagagc agctcttggc tcagcaaaga atgcacagta tgatcagctc agtggatgtg 180 
aagtcagaag ttcctgtggg cctggagccc atctcacctt tagacctaag gacagacctc 240 
aggatgatga tgcccgtggt ggaccctgtt gtccgtgaga agcaattgca gcaggaatta 300 
cttcttatcc agcagcagca acaaatccag aagcagcttc tgatagcaga gtttcagaaa 360 
cagcatgaga acttgacacg gcagcaccag gctcagcttc aggagcatat caaggaactt 420 
ctagccataa aacagcaaca agaactccta gaaaaggagc agaaactgga gcagcagagg 480 
caagaacagg aagtagagag gcatcgcaga gaacagcagc ttcctcctct cagaggcaaa 540 
gatagaggac gagaaagggc agtggcaagt acagaagtaa agcagaagct tcaagagttc 600 
ctactgagta aatcagcaac gaaagacact ccaactaatg gaaaaaatca ttccgtgagc 660 
cgccatccca agctctggta cacggctgcc caccacacat cattggatca aagctctcca 720 
ccccttagtg gaacatctcc atcctacaag tacacattac caggagcaca agatgcaaag 780 
gatgatttcc cccttcgaaa aactgcctct gagcccaact tgaaggtgcg gtccaggtta 840 
aaacagaaag tggcagagag gagaagcagc cccttactca ggcggaagga tggaaatgtt 900 
gtcacttcat tcaagaagcg aatgtttgag gtgacagaat cctcagtcag tagcagttct 960 
ccaggctctg gtcccagttc accaaacaat gggccaactg gaagtgttac tgaaaatgag 1020 
acttcggttt tgccccctac ccctcatgcc gagcaaatgg tttcacagca acgcattcta 1080 
attcatgaag attccatgaa cctgctaagt ctttatacct ctccttcttt gcccaacatt 1140 
accttggggc ttcccgcagt gccatcccag ctcaatgctt cgaattcact caaagaaaag 1200 
cagaagtgtg agacgcagac gcttaggcaa ggtgttcctc tgcctgggca gtatggaggc 1260 
agcatcccgg catcttccag ccaccctcat gttactttag agggaaagcc acccaacagc 1320 
agccaccagg ctctcctgca gcatttatta ttgaaagaac aaatgcgaca gcaaaagctt 1380 
cttgtagctg gtggagttcc cttacatcct cagtctccct tggcaacaaa agagagaatt 1440 
tcacctggca ttagaggtac ccacaaattg ccccgtcaca gacccctgaa ccgaacccag 1500 
tctgcacctt tgcctcagag cacgttggct cagctggtca ttcaacagca acaccagcaa 1560 
ttcttggaga agcagaagca ataccagcag cagatccaca tgaacaaact gctttcgaaa 1620 
tctattgaac aactgaagca accaggcagt caccttgagg aagcagagga agagcttcag 1680 
ggggaccagg cgatgcagga agacagagcg ccctctagtg gcaacagcac taggagcgac 1740
agcagtgctt gtgtggatga cacactggga caagttgggg ctgtgaaggt caaggaggaa 1800
ccagtggaca gtgatgaaga tgctcagatc caggaaatgg aatctgggga gcaggctgct 1860
tttatgcaac agcctttcct ggaacccacg cacacacgtg cgctctctgt gcgccaagct 1920
ccgctggctg cggttggcat ggatggatta gagaaacacc gtctcgtctc caggactcac 1980
tcttcccctg ctgcctctgt tttacctcac ccagcaatgg accgccccct ccagcctggc 2040
tctgcaactg gaattgccta tgaccccttg atgctgaaac accagtgcgt ttgtggcaat 2100
tccaccaccc accctgagca tgctggacga atacagagta tctggtcacg actgcaagaa 2160
actgggctgc taaataaatg tgagcgaatt caaggtcgaa aagccagcct ggaggaaata 2220
cagcttgttc attctgaaca tcactcactg ttgtatggca ccaaccccct ggacggacag 2280
aagctggacc ccaggatact cctaggtgat gactctcaaa agtttttttc ctcattacct 2340
tgtggtggac ttggggtgga cagtgacacc atttggaatg agctacactc gtccggtgct 2400
gcacgcatgg ctgttggctg tgtcatcgag ctggcttcca aagtggcctc aggagagctg 2460
aagaatgggt ttgctgttgt gaggccccct ggccatcacg ctgaagaatc cacagccatg 2520
gggttctgct tttttaattc agttgcaatt accgccaaat acttgagaga ccaactaaat 2580
ataagcaaga tattgattgt agatctggat gttcaccatg gaaacggtac ccagcaggcc 2640
ttttatgctg accccagcat cctgtacatt tcactccatc gctatgatga agggaacttt 2700
ttccctggca gtggagcccc aaatgaggtt ggaacaggcc ttggagaagg gtacaatata 2760
aatattgcct ggacaggtgg ccttgatcct cccatgggag atgttgagta ccttgaagca 2820
ttcaggacca tcgtgaagcc tgtggccaaa gagtttgatc cagacatggt cttagtatct 2880
gctggatttg atgcattgga aggccacacc cctcctctag gagggtacaa agtgacggca 2940
aaatgttttg gtcatttgac gaagcaattg atgacattgg ctgatggacg tgtggtgttg 3000
gctctagaag gaggacatga tctcacagcc atctgtgatg catcagaagc ctgtgtaaat 3060
gcccttctag gaaatgagct ggagccactt gcagaagata ttctccacca aagcccgaat 3120
atgaatgctg ttatttcttt acagaagatc attgaaattc aaagtatgtc tttaaagttc 3180
tcttaa 3186


<210> 2 
<211> 1011 
<212> PRT 

<213> Homo sapiens 



<400> 2 






Met His Ser Met Ile Ser Ser Val Asp Val Lys Ser Glu Val Pro Val
1               5                   10                  15

Gly Leu Glu Pro Ile Ser Pro Leu Asp Leu Arg Thr Asp Leu Arg Met
            20                  25                  30

Met Met Pro Val Val Asp Pro Val Val Arg Glu Lys Gln Leu Gln Gln
        35                  40                  45

Glu Leu Leu Leu Ile Gln Gln Gln Gln Gln Ile Gln Lys Gln Leu Leu
    50                  55                  60

Ile Ala Glu Phe Gln Lys Gln His Glu Asn Leu Thr Arg Gln His Gln
65                  70                  75                  80

Ala Gln Leu Gln Glu His Ile Lys Glu Leu Leu Ala Ile Lys Gln Gln
                85                  90                  95

Gln Glu Leu Leu Glu Lys Glu Gln Lys Leu Glu Gln Gln Arg Gln Glu
            100                 105                 110

Gln Glu Val Glu Arg His Arg Arg Glu Gln Gln Leu Pro Pro Leu Arg
        115                 120                 125

Gly Lys Asp Arg Gly Arg Glu Arg Ala Val Ala Ser Thr Glu Val Lys
    130                 135                 140

Gln Lys Leu Gln Glu Phe Leu Leu Ser Lys Ser Ala Thr Lys Asp Thr
145                 150                 155                 160

Pro Thr Asn Gly Lys Asn His Ser Val Ser Arg His Pro Lys Leu Trp
                165                 170                 175

Tyr Thr Ala Ala His His Thr Ser Leu Asp Gln Ser Ser Pro Pro Leu
            180                 185                 190

Ser Gly Thr Ser Pro Ser Tyr Lys Tyr Thr Leu Pro Gly Ala Gln Asp
        195                 200                 205

Ala Lys Asp Asp Phe Pro Leu Arg Lys Thr Ala Ser Glu Pro Asn Leu
    210                 215                 220

Lys Val Arg Ser Arg Leu Lys Gln Lys Val Ala Glu Arg Arg Ser Ser
225                 230                 235                 240

Pro


Leu 


Leu 


Arg 


Arg 


Lys 










245 




Arg 


Met 


Phe 


Glu 


Val 


Thr 








260 






Ser 


Gly 


Pro 


Ser 


Ser 


Pro 






275 








Asn 


Glu 


Thr 


Ser 


Val 


Leu 




290 










Ser 


Gin 


Gin 


Arg 


He 


Leu 


305 










310 


Leu 


Tyr 


Thr 


Ser 


Pro 


Ser 










325 




Val 


Pro 


Ser 


Gin 


Leu 


Asn 








340 






Cys 


Glu 


Thr 


Gin 


Thr 


Leu 






355 








Gly 


Gly 


Ser 


lie 


Pro 


Ala 




370 










Gly 


Lys 


Pro 


Pro 


Asn 


Ser 


385 










390 


Leu 


Lys 


Glu 


Gin 


Met 


Arg 










405 




Pro 


Leu 


His 


Pro 


Gin 


Ser 








420 






Gly 


lie 


Arg 


Gly Thr 


His 






435 








Thr 


Gin 


Ser 


Ala 


Pro 


Leu 




450 










Gin 


Gin 


Gin 


His 


Gln


Gln


465 










470 


Gln


Ile


His 


Met 


Asn 


Lys 










485 




Gln


Pro 


Gly 


Ser 


His 


Leu 








500 






Gln


Ala 


Met 


Gln


Glu 


Asp 






515 








Ser 


Asp 


Ser 


Ser 


Ala 


Cys 




530 










Val 


Lys 


Val 


Lys 


Glu 


Glu 


545 










550 


Gln


Glu 


Met 


Glu 


Ser 


Gly 










565 




Leu 


Glu 


Pro 


Thr 


His 


Thr 








580 






Ala 


Ala 


Val 


Gly Met 


Asp 






595 








Thr 


His 


Ser 


Ser 


Pro 


Ala 




610 










Arg 


Pro 


Leu 


Gln


Pro 


Gly 


625 










630 


Met 


Leu 


Lys 


His 


Gln


Cys 










645 




His 


Ala 


Gly 


Arg 


Ile


Gln








660 






Leu 


Leu 


Asn 


Lys 


Cys 


Glu 






675 








Glu 


Ile


Gln


Leu 


Val 


His 




690 










Asn 


Pro 


Leu 


Asp Gly 


Gln


705 










710 


Asp 


Ser 


Gln


Lys 


Phe 


Phe 








725 








Asp Gly Asn 


Val 


Val 


Thr 




250 






Glu Ser Ser 


Val 


Ser 


Ser 


265 








Asn Asn Gly 


Pro 


Thr 


Gly 


280 








Pro Pro Thr 


Pro 


His 


Ala 


295 






300 


Ile His Glu


Asp 


Ser 


Met 






315 




Leu Pro Asn 


Ile


Thr 


Leu 




330 






Ala Ser Asn 


Ser 


Leu 


Lys 


345 








Arg Gln Gly


Val 


Pro 


Leu 


360 








Ser Ser Ser 


His 


Pro 


His 


375 






380 


Ser His Gln


Ala 


Leu 


Leu 






395 




Gln Gln Lys


Leu 


Leu 


Val 




410 






Pro Leu Ala 


Thr 


Lys 


Glu 


425 








Lys Leu Pro 


Arg 


His 


Arg 


440 








Pro Gln Ser


Thr 


Leu 


Ala 


455 






460 


Phe Leu Glu 


Lys 


Gln


Lys 






475 




Leu Leu Ser 


Lys 


Ser 


Ile




490 






Glu Glu Ala


Glu 


Glu 


Glu 


505 








Arg Ala Pro 


Ser 


Ser 


Gly 


520 








Val Asp Asp 


Thr 


Leu 


Gly 


535 






540 


Pro Val Asp 


Ser 


Asp 


Glu 






555 




Glu Gln Ala


Ala 


Phe 


Met 




570 






Arg Ala Leu 


Ser 


Val 


Arg 


585 








Gly Leu Glu 


Lys 


His 


Arg 


600 








Ala Ser Val 


Leu 


Pro 


His 


615 






620


Ser Ala Thr 


Gly 


Ile


Ala 






635




Val Cys Gly 


Asn 


Ser 


Thr 




650 






Ser Ile Trp


Ser 


Arg 


Leu 


665 








Arg Ile Gln


Gly 


Arg 


Lys 


680 








Ser Glu His 


His 


Ser 


Leu 


695 






700 


Lys Leu Asp 


Pro 


Arg 


Ile






715 




Ser Ser Leu 


Pro 


Cys 


Gly 



730
Ser


Phe 


Lys 


Lys 






255 




Ser 


Ser 


Pro 


Gly 




270 






Ser 


Val 


Thr 


Glu 


285 








Glu 


Gln


Met 


Val 


Asn 


Leu 


Leu 


Ser 








320 


Gly 


Leu 


Pro 


Ala 






335 




Glu 


Lys 


Gln


Lys 




350 






Pro 


Gly 


Gln


Tyr 


365 








Val 


Thr 


Leu 


Glu 


Gln


His 


Leu 


Leu 








400 


Ala 


Gly 


Gly 


Val 






415 




Arg 


Ile


Ser 


Pro 




430 






Pro 


Leu 


Asn 


Arg 


445 








Gln


Leu 


Val 


Ile


Gln


Tyr 


Gln


Gln








480 


Glu 


Gln


Leu 


Lys 






495 




Leu 


Gln


Gly 


Asp 




510 






Asn 


Ser 


Thr 


Arg 


525 








Gln


Val 


Gly 


Ala 


Asp 


Ala 


Gln


Ile








560 


Gln


Gln


Pro 


Phe 






575 




Gln


Ala 


Pro 


Leu 




590 






Leu 


Val 


Ser 


Arg 


605 








Pro 


Ala 


Met 


Asp 


Tyr 


Asp 


Pro 


Leu 








640 


Thr 


His 


Pro 


Glu 






655 




Gln


Glu 


Thr 


Gly 




670 






Ala 


Ser 


Leu 


Glu 


685 








Leu 


Tyr 


Gly 


Thr 


Leu 


Leu 


Gly 


Asp 








720 


Gly 


Leu 


Gly 


Val 






735
Asp Ser


Asp 


Thr 


Ile Trp Asn Glu Leu His


Ser 


Ser 


Gly 


Ala 


Ala 


Arg 






740 


745 








750 






Met Ala 


Val 


Gly 


Cys Val Ile Glu Leu Ala


Ser 


Lys 


Val 


Ala 


Ser 


Gly 




755 




760 






765 








Glu Leu 


Lys 


Asn 


Gly Phe Ala Val Val Arg 


Pro 


Pro 


Gly 


His 


His 


Ala 


770 






775 




780 










Glu Glu 


Ser 


Thr 


Ala Met Gly Phe Cys Phe 


Phe 


Asn 


Ser 


Val 


Ala 


Ile


785 






790 


795 










800 


Thr Ala 


Lys 


Tyr 


Leu Arg Asp Gln Leu Asn


Ile


Ser 


Lys 


Ile


Leu 


Ile








805 810 










815 




Val Asp 


Leu 


Asp 


Val His His Gly Asn Gly 


Thr 


Gln


Gln


Ala 


Phe 


Tyr 






820 


825 








830 






Ala Asp 


Pro 


Ser 


Ile Leu Tyr Ile Ser Leu


His 


Arg 


Tyr 


Asp 


Glu 


Gly 




835 




840 






845 








Asn Phe 


Phe 


Pro 


Gly Ser Gly Ala Pro Asn 


Glu 


Val 


Gly 


Thr 


Gly 


Leu 


850 






855 




860 










Gly Glu 


Gly 


Tyr 


Asn Ile Asn Ile Ala Trp


Thr 


Gly 


Gly 


Leu 


Asp 


Pro 


865 






870 


875 










880 


Pro Met 


Gly 


Asp 


Val Glu Tyr Leu Glu Ala 


Phe 


Arg 


Thr 


Ile


Val 


Lys 








885 890 










895 




Pro Val 


Ala 


Lys 


Glu Phe Asp Pro Asp Met 


Val 


Leu 


Val 


Ser 


Ala 


Gly 






900 


905 








910 






Phe Asp 


Ala 


Leu 


Glu Gly His Thr Pro Pro 


Leu 


Gly 


Gly 


Tyr 


Lys 


Val 




915 




920 






925 








Thr Ala 


Lys 


Cys 


Phe Gly His Leu Thr Lys 


Gin 


Leu 


Met 


Thr 


Leu 


Ala 


930 






935 




940 










Asp Gly 


Arg 


Val 


Val Leu Ala Leu Glu Gly 


Gly 


His 


Asp 


Leu 


Thr 


Ala 


945 






950 


955 










960 


Ile Cys


Asp 


Ala 


Ser Glu Ala Cys Val Asn 


Ala 


Leu 


Leu 


Gly 


Asn 


Glu 








965 970 










975 




Leu Glu 


Pro 


Leu 


Ala Glu Asp Ile Leu His


Gln


Ser 


Pro 


Asn 


Met 


Asn 






980 


985 








990 






Ala Val 


Ile


Ser 


Leu Gln Lys Ile Ile Glu


Ile


Gln


Ser 


Met 


Ser 


Leu 




995 




1000 






1005 






Lys Phe 


Ser 



















1010 



<210> 3 

<211> 3499 

<212> DNA 

<213> Homo sapiens 

<400> 3 

ggggaagaga ggcacagaca cagataggag aagggcaccg gctggagcca cttgcaggac 60 
tgagggtttt tgcaacaaaa ccctagcagc ctgaagaact ctaagccaga tggggtggct 120 
ggacgagagc agctcttggc tcagcaaaga atgcacagta tgatcagctc agtggatgtg 180 
aagtcagaag ttcctgtggg cctggagccc atctcacctt tagacctaag gacagacctc 240 
aggatgatga tgcccgtggt ggaccctgtt gtccgtgaga agcaattgca gcaggaatta 300 
cttcttatcc agcagcagca acaaatccag aagcagcttc tgatagcaga gtttcagaaa 360 
cagcatgaga acttgacacg gcagcaccag gctcagcttc aggagcatat caaggaactt 420 
ctagccataa aacagcaaca agaactccta gaaaaggagc agaaactgga gcagcagagg 480 
caagaacagg aagtagagag gcatcgcaga gaacagcagc ttcctcctct cagaggcaaa 540 
gatagaggac gagaaagggc agtggcaagt acagaagtaa agcagaagct tcaagagttc 600 
ctactgagta aatcagcaac gaaagacact ccaactaatg gaaaaaatca ttccgtgagc 660 
cgccatccca agctctggta cacggctgcc caccacacat cattggatca aagctctcca 720 
ccccttagtg gaacatctcc atcctacaag tacacattac caggagcaca agatgcaaag 780 
gatgatttcc cccttcgaaa aactgcctct gagcccaact tgaaggtgcg gtccaggtta 840 
aaacagaaag tggcagagag gagaagcagc cccttactca ggcggaagga tggaaatgtt 900 
gtcacttcat tcaagaagcg aatgtttgag gtgacagaat cctcagtcag tagcagttct 960 
ccaggctctg gtcccagttc accaaacaat gggccaactg gaagtgttac tgaaaatgag 1020 
acttcggttt tgccccctac ccctcatgcc gagcaaatgg tttcacagca acgcattcta 1080
attcatgaag attccatgaa cctgctaagt ctttatacct ctccttcttt gcccaacatt 1140
accttggggc ttcccgcagt gccatcccag ctcaatgctt cgaattcact caaagaaaag 1200
cagaagtgtg agacgcagac gcttaggcaa ggtgttcctc tgcctgggca gtatggaggc 1260
agcatcccgg catcttccag ccaccctcat gttactttag agggaaagcc acccaacagc 1320
agccaccagg ctctcctgca gcatttatta ttgaaagaac aaatgcgaca gcaaaagctt 1380
cttgtagctg gtggagttcc cttacatcct cagtctccct tggcaacaaa agagagaatt 1440
tcacctggca ttagaggtac ccacaaattg ccccgtcaca gacccctgaa ccgaacccag 1500
tctgcacctt tgcctcagag cacgttggct cagctggtca ttcaacagca acaccagcaa 1560
ttcttggaga agcagaagca ataccagcag cagatccaca tgaacaaact gctttcgaaa 1620
tctattgaac aactgaagca accaggcagt caccttgagg aagcagagga agagcttcag 1680
ggggaccagg cgatgcagga agacagagcg ccctctagtg gcaacagcac taggagcgac 1740
agcagtgctt gtgtggatga cacactggga caagttgggg ctgtgaaggt caaggaggaa 1800
ccagtggaca gtgatgaaga tgctcagatc caggaaatgg aatctgggga gcaggctgct 1860
tttatgcaac agcctttcct ggaacccacg cacacacgtg cgctctctgt gcgccaagct 1920
ccgctggctg cggttggcat ggatggatta gagaaacacc gtctcgtctc caggactcac 1980
tcttcccctg ctgcctctgt tttacctcac ccagcaatgg accgccccct ccagcctggc 2040
tctgcaactg gaattgccta tgaccccttg atgctgaaac accagtgcgt ttgtggcaat 2100
tccaccaccc accctgagca tgctggacga atacagagta tctggtcacg actgcaagaa 2160
actgggctgc taaataaatg tgagcgaatt caaggtcgaa aagccagcct ggaggaaata 2220
cagcttgttc attctgaaca tcactcactg ttgtatggca ccaaccccct ggacggacag 2280
aagctggacc ccaggatact cctaggtgat gactctcaaa agtttttttc ctcattacct 2340
tgtggtggac ttggggtgga cagtgacacc atttggaatg agctacactc gtccggtgct 2400
gcacgcatgg ctgttggctg tgtcatcgag ctggcttcca aagtggcctc aggagagctg 2460
aagaatgggt ttgctgttgt gaggccccct ggccatcacg ctgaagaatc cacagccatg 2520
gggttctgct tttttaattc agttgcaatt accgccaaat acttgagaga ccaactaaat 2580
ataagcaaga tattgattgt agatctggat gttcaccatg gaaacggtac ccagcaggcc 2640
ttttatgctg accccagcat cctgtacatt tcactccatc gctatgatga agggaacttt 2700
ttccctggca gtggagcccc aaatgaggtt cggtttattt ctttagagcc ccacttttat 2760
ttgtatcttt caggtaattg cattgcatga ttacccctaa ttttcttgtc ctttgctggt 2820
gttttaaatt acacgagatt actgaattgt cccatgggac caagaaccag tgcagaacaa 2880
gtgcataacc cagagcactg tttgtcaggg aaggttgggc tgatttgatg tgttgtttga 2940
tgtttatttc aagagctccc atgtgcttgt tttcctctct tcttgctttc ttccatttgc 3000
tctcttctct gcccaccgtg gtgtgtcttt ctcttcccag gttggaacag gccttggaga 3060
agggtacaat ataaatattg cctggacagg tggccttgat cctcccatgg gagatgttga 3120
gtaccttgaa gcattcagga ccatcgtgaa gcctgtggcc aaagagtttg atccagacat 3180
ggtcttagta tctgctggat ttgatgcatt ggaaggccac acccctcctc taggagggta 3240
caaagtgacg gcaaaatgtt ttggtcattt gacgaagcaa ttgatgacat tggctgatgg 3300
acgtgtggtg ttggctctag aaggaggaca tgatctcaca gccatctgtg atgcatcaga 3360
agcctgtgta aatgcccttc taggaaatga gctggagcca cttgcagaag atattctcca 3420
ccaaagcccg aatatgaatg ctgttatttc tttacagaag atcattgaaa ttcaaagtat 3480
gtctttaaag ttctcttaa 3499



<210> 4 
<211> 879 
<212> PRT 

<213> Homo sapiens 



<400> 4 



Met 


His 


Ser 


Met 


Ile


Ser 


Ser 


Val 


Asp 


Val 


Lys 


Ser 


Glu 


Val 


Pro 


Val 


1 








5 










10 










15 




Gly 


Leu 


Glu 


Pro 


Ile


Ser 


Pro 


Leu 


Asp 


Leu 


Arg 


Thr 


Asp 


Leu 


Arg 


Met 








20 










25 










30 






Met 


Met 


Pro 


Val 


Val 


Asp 


Pro 


Val 


Val 


Arg 


Glu 


Lys 


Gln


Leu 


Gln


Gln






35 










40 










45 








Glu 


Leu 


Leu 


Leu 


Ile


Gln


Gln


Gln


Gln


Gln


Ile


Gln


Lys 


Gln


Leu 


Leu 




50 










55 










60 










Ile


Ala 


Glu 


Phe 


Gln


Lys 


Gln


His 


Glu 


Asn 


Leu 


Thr 


Arg 


Gln


His 


Gln


65 










70 










75 










80 


Ala 


Gln


Leu 


Gln


Glu 


His 


Ile


Lys 


Glu 


Leu 


Leu 


Ala 


Ile


Lys 


Gln


Gln










85 










90 










95 




Gln


Glu 


Leu 


Leu 


Glu 


Lys 


Glu 


Gln


Lys 


Leu 


Glu 


Gln


Gln


Arg 


Gln


Glu 



100 105 110
Gln


Glu 


Val 
115 


Glu 


Arg His 


Gly 


Lys 

130


Asp 


Arg 


Gly Arg 


Gln


Lys 


Leu 


Gln


Glu Phe 


145 








150 


Pro 


Thr 


Asn 


Gly 


Lys Asn 
165 


Tyr 


Thr 


Ala 


Ala 
180 


His His 


Ser 


Gly 


Thr 
195 


Ser 


Pro Ser 


Ala 


Lys 

210 


Asp 


Asp 


Phe Pro 


Lys 


Val 


Arg 


Ser 


Arg Leu 


225 








230 


Pro 


Leu 


Leu 


Arg 


Arg Lys 
245 


Arg 


Met 


Phe 


Glu 
260 


Val Thr 


Ser 


Gly 


Pro 
275 


Ser 


Ser Pro 


Asn 


Glu 
290 


Thr 


Ser 


Val Leu 


Ser 


Gln


Gln


Arg 


Ile Leu


305 








310 


Leu 


Tyr 


Thr 


Ser 


Pro Ser 
325 


Val 


Pro 


Ser 


Gln
340 


Leu Asn 


Cys 


Glu 


Thr 
355 


Gln


Thr Leu 


Gly 


Gly 
370 


Ser 


Ile


Pro Ala 


Gly 


Lys 


Pro 


Pro 


Asn Ser 


385 








390 


Leu 


Lys 


Glu 


Gln


Met Arg 
405 


Pro 


Leu 


His 


Pro 
420 


Gln Ser


Gly 


Ile


Arg 
435 


Gly 


Thr His 


Thr 


Gln
450 


Ser 


Ala 


Pro Leu 


Gln


Gln


Gln


His 


Gln Gln


465 








470 


Gln


Ile


His 


Met 


Asn Lys 
485 


Gln


Pro 


Gly 


Ser 
500 


His Leu 


Gln


Ala



Met 
515 


Gln


Glu Asp 


Ser 


Asp 
530 


Ser 


Ser 


Ala Cys 


Val 


Lys 


Val 


Lys 


Glu Glu 


545 








550 


Gln


Glu 


Met 


Glu 


Ser Gly 
565 


Leu 


Glu 


Pro 


Thr 
580 


His Thr 


Ala 


Ala 


Val 
595 


Gly 


Met Asp
Arg Arg Glu Gln


Gln


Leu 


120 






Glu Arg Ala Val 


Ala 


Ser 


135




140


Leu Leu Ser Lys 


Ser 


Ala 




155




His Ser Val Ser


Arg 


His


170 






Thr Ser Leu Asp 


Gln


Ser 


185 






Tyr Lys Tyr Thr 


Leu 


Pro 


200 






Leu Arg Lys Thr 


Ala 


Ser 


215 




220 


Lys Gln Lys Val


Ala 


Glu 




235 




Asp Gly Asn Val 


Val 


Thr 


250 






Glu Ser Ser Val 


Ser 


Ser 


265 






Asn Asn Gly Pro 


Thr 


Gly 


280 






Pro Pro Thr Pro 


His 


Ala 


295 




300 


Ile His Glu Asp


Ser 


Met 




315 




Leu Pro Asn Ile


Thr 


Leu 


330 






Ala Ser Asn Ser 


Leu 


Lys 


345 






Arg Gln Gly Val


Pro 


Leu 


360 






Ser Ser Ser His 


Pro 


His 


375 




380 


Ser His Gln Ala


Leu 


Leu 




395 




Gln Gln Lys Leu


Leu 


Val 


410 






Pro Leu Ala Thr 


Lys 


Glu 


425 






Lys Leu Pro Arg 


His 


Arg 


440 






Pro Gln Ser Thr


Leu 


Ala 


455 




460 


Phe Leu Glu Lys 


Gln


Lys 




475 




Leu Leu Ser Lys 


Ser 


Ile


490






Glu Glu Ala Glu


Glu


Glu


505






Arg Ala Pro Ser 


Ser 


Gly 


520 






Val Asp Asp Thr 


Leu 


Gly 


535 




540 


Pro Val Asp Ser 


Asp 


Glu 




555 




Glu Gln Ala Ala


Phe 


Met 


570 






Arg Ala Leu Ser 


Val 


Arg 


585 






Gly Leu Glu Lys 


His 


Arg 



600 



PCT/US02/19051 



Pro 


Pro 


Leu 


Arg 


125 








Thr 


Glu 


Val 


Lys 


Thr 


Lys 


Asp 


Thr 








160 


Pro 


Lys 


Leu 


Trp 






175 




Ser 


Pro 


Pro 


Leu 




190 






Gly 


Ala 


Gln


Asp 


205 








Glu 


Pro 


Asn 


Leu 


Arg 


Arg 


Ser 


Ser 








240 


Ser 


Phe 


Lys 


Lys 






255 




Ser 


Ser 


Pro 


Gly 




270 






Ser 


Val 


Thr 


Glu 


285 








Glu 


Gln


Met 


Val 


Asn 


Leu 


Leu 


Ser 








320 


Gly 


Leu 


Pro 


Ala 






335 




Glu 


Lys 


Gln


Lys 




350 






Pro 


Gly 


Gln


Tyr 


365 








Val 


Thr 


Leu 


Glu 


Gln


His 


Leu 


Leu 








400 


Ala 


Gly 


Gly 


Val 






415 




Arg 


Ile


Ser 


Pro 




430 






Pro 


Leu 


Asn 


Arg 


445 








Gln


Leu 


Val 


Ile


Gln


Tyr 


Gln


Gln








480 


Glu 


Gln


Leu 


Lys 






495 




Leu 


Gln


Gly 


Asp 




510 






Asn 


Ser 


Thr 


Arg 


525 








Gln


Val 


Gly 


Ala 


Asp 


Ala 


Gln


Ile








560 


Gln


Gln


Pro 


Phe 






575 




Gln


Ala 


Pro 


Leu 




590 






Leu 


Val 


Ser 


Arg 


605
Thr


His 


Ser 


Ser 


Pro Ala Ala Ser 




610 






615 


Arg 


Pro 


Leu 


Gln


Pro Gly Ser Ala 


625 








630 


Met 


Leu 


Lys 


His 


Gln Cys Val Cys










645 


His 


Ala 


Gly 


Arg 


Ile Gln Ser Ile








660 




Leu 


Leu 


Asn 


Lys 


Cys Glu Arg Ile






675 




680 


Glu 


Ile


Gln


Leu 


Val His Ser Glu 




690 






695 


Asn 


Pro 


Leu 


Asp 


Gly Gln Lys Leu


705 








710 


Asp 


Ser 


Gln


Lys 


Phe Phe Ser Ser 










725 


Asp 


Ser 


Asp 


Thr 


Ile Trp Asn Glu








740 




Met 


Ala 


Val 


Gly 


Cys Val Ile Glu






755 




760 


Glu 


Leu 


Lys 


Asn 


Gly Phe Ala Val 




770 






775 


Glu 


Glu 


Ser 


Thr 


Ala Met Gly Phe 


785 








790 


Thr 


Ala 


Lys 


Tyr 


Leu Arg Asp Gln










805 


Val 


Asp 


Leu 


Asp 


Val His His Gly 








820 




Ala 


Asp 


Pro 


Ser 


Ile Leu Tyr Ile






835 




840 


Asn 


Phe 


Phe 


Pro 


Gly Ser Gly Ala 




850 






855 


Leu 


Glu 


Pro 


His 


Phe Tyr Leu Tyr 


865 








870 



Val Leu 


Pro 


His 


Pro 


Ala 


Met 


Asp 
















Thr Gly 


Ile


Ala 


Tyr 


Asp 


Pro 


Leu 




635












Gly Asn 


Ser 


Thr 


Thr 


His 


Pro 


Glu 


650 










655 




Trp Ser 


Arg 


Leu 


Gln


Glu 


Thr 


Gly 


665 








670 






Gln Gly


Arg 


Lys 


Ala 


Ser 


Leu 


Glu 








685 








His His 


Ser 


Leu 


Leu 


Tyr 


Gly 


Thr 






700 










Asp Pro


Arg 


Ile


Leu 


Leu 


Gly 


Asp 




715 










720 


Leu Pro 


Cys 


Gly 


Gly 


Leu 


Gly 


Val 


730 










735 




Leu His 


Ser 


Ser 


Gly 


Ala 


Ala 


Arg 


745 








750 






Leu Ala 


Ser 


Lys 


Val 


Ala 


Ser 


Gly 








765 








Val Arg 


Pro 


Pro 


Gly 


His 


His 


Ala 






780










Cys Phe 


Phe 


Asn 


Ser 


Val 


Ala 


Ile




795 










800 


Leu Asn 


Ile


Ser 


Lys 


Ile


Leu 


Ile


810 










815 




Asn Gly 


Thr 


Gln


Gln


Ala 


Phe 


Tyr 


825 








830 






Ser Leu 


His 


Arg 


Tyr 


Asp 


Glu 


Gly 








845 








Pro Asn 


Glu 


Val 


Arg 


Phe 


Ile


Ser 






860 










Leu Ser 


Gly 


Asn 


Cys 


Ile


Ala 





875 



<210> 5 
<211> 3054 
<212> DNA

<213> Homo sapiens 



<400> 5 

ggggaagaga ggcacagaca cagataggag aagggcaccg gctggagcca cttgcaggac 60
tgagggtttt tgcaacaaaa ccctagcagc ctgaagaact ctaagccaga tggggtggct 120
ggacgagagc agctcttggc tcagcaaaga atgcacagta tgatcagctc agtggatgtg 180
aagtcagaag ttcctgtggg cctggagccc atctcacctt tagacctaag gacagacctc 240
aggatgatga tgcccgtggt ggaccctgtt gtccgtgaga agcaattgca gcaggaatta 300
cttcttatcc agcagcagca acaaatccag aagcagcttc tgatagcaga gtttcagaaa 360
cagcatgaga acttgacacg gcagcaccag gctcagcttc aggagcatat caaggaactt 420
ctagccataa aacagcaaca agaactccta gaaaaggagc agaaactgga gcagcagagg 480
caagaacagg aagtagagag gcatcgcaga gaacagcagc ttcctcctct cagaggcaaa 540
gatagaggac gagaaagggc agtggcaagt acagaagtaa agcagaagct tcaagagttc 600
ctactgagta aatcagcaac gaaagacact ccaactaatg gaaaaaatca ttccgtgagc 660
cgccatccca agctctggta cacggctgcc caccacacat cattggatca aagctctcca 720
ccccttagtg gaacatctcc atcctacaag tacacattac caggagcaca agatgcaaag 780
gatgatttcc cccttcgaaa aactgaatcc tcagtcagta gcagttctcc aggctctggt 840
cccagttcac caaacaatgg gccaactgga agtgttactg aaaatgagac ttcggttttg 900
ccccctaccc ctcatgccga gcaaatggtt tcacagcaac gcattctaat tcatgaagat 960
tccatgaacc tgctaagtct ttatacctct ccttctttgc ccaacattac cttggggctt 1020
cccgcagtgc catcccagct caatgcttcg aattcactca aagaaaagca gaagtgtgag 1080
acgcagacgc ttaggcaagg tgttcctctg cctgggcagt atggaggcag catcccggca 1140
tcttccagcc accctcatgt tactttagag ggaaagccac ccaacagcag ccaccaggct 1200
ctcctgcagc atttattatt gaaagaacaa atgcgacagc aaaagcttct tgtagctggt 1260
ggagttccct tacatcctca gtctcccttg gcaacaaaag agagaatttc acctggcatt 1320 
agaggtaccc acaaattgcc ccgtcacaga cccctgaacc gaacccagtc tgcacctttg 1380 
cctcagagca cgttggctca gctggtcatt caacagcaac accagcaatt cttggagaag 1440 
cagaagcaat accagcagca gatccacatg aacaaactgc tttcgaaatc tattgaacaa 1500 
ctgaagcaac caggcagtca ccttgaggaa gcagaggaag agcttcaggg ggaccaggcg 1560 
atgcaggaag acagagcgcc ctctagtggc aacagcacta ggagcgacag cagtgcttgt 1620 
gtggatgaca cactgggaca agttggggct gtgaaggtca aggaggaacc agtggacagt 1680 
gatgaagatg ctcagatcca ggaaatggaa tctggggagc aggctgcttt tatgcaacag 1740 
cctttcctgg aacccacgca cacacgtgcg ctctctgtgc gccaagctcc gctggctgcg 1800 
gttggcatgg atggattaga gaaacaccgt ctcgtctcca ggactcactc ttcccctgct 1860 
gcctctgttt tacctcaccc agcaatggac cgccccctcc agcctggctc tgcaactgga 1920 
attgcctatg accccttgat gctgaaacac cagtgcgttt gtggcaattc caccacccac 1980 
cctgagcatg ctggacgaat acagagtatc tggtcacgac tgcaagaaac tgggctgcta 2040
aataaatgtg agcgaattca aggtcgaaaa gccagcctgg aggaaataca gcttgttcat 2100 
tctgaacatc actcactgtt gtatggcacc aaccccctgg acggacagaa gctggacccc 2160 
aggatactcc taggtgatga ctctcaaaag tttttttcct cattaccttg tggtggactt 2220 
ggggtggaca gtgacaccat ttggaatgag ctacactcgt ccggtgctgc acgcatggct 2280 
gttggctgtg tcatcgagct ggcttccaaa gtggcctcag gagagctgaa gaatgggttt 2340 
gctgttgtga ggccccctgg ccatcacgct gaagaatcca cagccatggg gttctgcttt 2400 
tttaattcag ttgcaattac cgccaaatac ttgagagacc aactaaatat aagcaagata 2460 
ttgattgtag atctggatgt tcaccatgga aacggtaccc agcaggcctt ttatgctgac 2520 
cccagcatcc tgtacatttc actccatcgc tatgatgaag ggaacttttt ccctggcagt 2580 
ggagccccaa atgaggttgg aacaggcctt ggagaagggt acaatataaa tattgcctgg 2640 
acaggtggcc ttgatcctcc catgggagat gttgagtacc ttgaagcatt caggaccatc 2700 
gtgaagcctg tggccaaaga gtttgatcca gacatggtct tagtatctgc tggatttgat 2760 
gcattggaag gccacacccc tcctctagga gggtacaaag tgacggcaaa atgttttggt 2820 
catttgacga agcaattgat gacattggct gatggacgtg tggtgttggc tctagaagga 2880 
ggacatgatc tcacagccat ctgtgatgca tcagaagcct gtgtaaatgc ccttctagga 2940 
aatgagctgg agccacttgc agaagatatt ctccaccaaa gcccgaatat gaatgctgtt 3000 
atttctttac agaagatcat tgaaattcaa agtatgtctt taaagttctc ttaa 3054 

<210> 6 
<211> 967 
<212> PRT 

<213> Homo sapiens 
<400> 6 



Met 


His 


Ser 


Met 


Ile


Ser 


Ser 


Val 


Asp 


Val 


Lys 


Ser 


Glu 


Val 


Pro 


Val 


1 








5 










10
Gly


Leu 


Glu 


Pro 
20 


Ile


Ser 


Pro 


Leu 


Asp 
25 


Leu 


Arg 


Thr 


Asp 


Leu 
30 


Arg 


Met 


Met 


Met 


Pro 
35 


Val 


Val 


Asp 


Pro 


Val 
40 


Val 


Arg 


Glu 


Lys 


Gln
45 


Leu 


Gln


Gln


Glu 


Leu 
50 


Leu 


Leu 


Ile


Gln


Gln
55 


Gln


Gln


Gln


Ile


Gln
60 


Lys 


Gln


Leu 


Leu 


Ile


Ala 


Glu 


Phe 


Gln


Lys 


Gln


His 


Glu 


Asn 


Leu 


Thr 


Arg 


Gln


His 


Gln


65 










70 










75 










80 


Ala 


Gln


Leu 


Gln


Glu 
85 


His 


Ile


Lys 


Glu 


Leu 
90 


Leu 


Ala 


Ile


Lys 


Gln
95 


Gln


Gln


Glu 


Leu 


Leu 
100 


Glu 


Lys 


Glu 


Gln


Lys 
105 


Leu 


Glu 


Gln


Gln


Arg 
110 


Gln


Glu 


Gln


Glu 


Val 
115 


Glu 


Arg 


His 


Arg 


Arg 
120 


Glu 


Gln


Gln


Leu 


Pro 
125 


Pro 


Leu 


Arg 


Gly 


Lys 
130 


Asp 


Arg 


Gly 


Arg 


Glu 
135 


Arg 


Ala 


Val 


Ala 


Ser 
140 


Thr 


Glu 


Val 


Lys 


Gln


Lys 


Leu 


Gln


Glu 


Phe 


Leu 


Leu 


Ser 


Lys 


Ser 


Ala 


Thr 


Lys 


Asp 


Thr 


145 










150 










155 










160 


Pro 


Thr 


Asn 


Gly 


Lys 
165 


Asn 


His 


Ser 


Val 


Ser 
170 


Arg 


His 


Pro 


Lys 


Leu 
175 


Trp 


Tyr 


Thr 


Ala 


Ala 


His 


His 


Thr 


Ser 


Leu 


Asp 


Gln


Ser 


Ser 


Pro 


Pro 


Leu
Ser


Gly 


Thr 
195 


Ser


Pro 


Ser 


Tyr 


Lys 
200 


Ala


Lys 
210


Asp 


Asp 


Phe


Pro 


Leu 


Arg 


Ser 


Ser 


Pro 


Gly 


Ser 


Gly 


Pro 


Ser 


225 










230 






Ser 


Val 


Thr 


Glu 


Asn 
245 


Glu 


Thr 


Ser 




Gln


Met 


Val 
260 


Ser 


Gln


Gln


Arg 


Asn 


Leu 


Leu 
275 


Ser 


Leu 


Tyr 


Thr 


Ser 
280 


Gly 


Leu 

290


Pro 


Ala 


Val 


Pro 


Ser 

295


Gln


Glu 


Lys 


Gln


Lys 


Cys 


Glu 


Thr 


Gln


305 










310 






Pro 


Gly 


Gln


Tyr 


Gly 
325 


Gly 


Ser 


Ile


Val 


Thr 


Leu 


Glu 
340 


Gly 


Lys 


Pro 


Pro 


Gln


His 


Leu 
355 


Leu 


Leu 


Lys 


Glu 


Gln
360 


Ala 


Gly 

370


Gly 


Val 


Pro 


Leu 


His 
375 


Pro 


Arg 


Ile


Ser 


Pro 


Gly 


Ile


Arg 


Gly 


385 










390 






Pro 


Leu 


Asn 


Arg 


Thr 
405 


Gln


Ser 


Ala 


Gln


Leu 


Val 


Ile
420 


Gln


Gln


Gln


His 


Gln


Tyr 


Gln

435 


Gln


Gln


Ile


His 


Met 
440 


Glu 


Gln

450


Leu 


Lys 


Gln


Pro 


Gly 

455


Ser 


Leu 


Gln


Gly 


Asp 


Gln


Ala 


Met 


Gln


465 










470 






Asn 


Ser 


Thr 


Arg 


Ser 
485 


Asp 


Ser 


Ser 


Gln


Val 


Gly 


Ala
500 


Val 


Lys 


Val 


Lys 


Asp 


Ala 


Gln
515 


Ile


Gln


Glu 


Met 


Glu 
520 


Gln


Gln

530


Pro 


Phe


Leu 


Glu


Pro 


Thr 


Gln


Ala 


Pro 


Leu 


Ala 


Ala 


Val


Gly 


545 










550 






Leu 


Val 


Ser 


Arg 


Thr
565 


His


Ser 


Ser 


Pro 


Ala




Asp 
580 


Arg 


Pro 


Leu 


Gln


Tyr


Asp


c L. \J 

595 






Leu


Lys 


His

600 


Thr 


His 
610 


Pro 


Glu 


His 


Ala 


Gly 
615 


Arg 


Gln


Glu 


Thr 


Gly 


Leu 


Leu 


Asn 


Lys 


625 










630 






Ala 


Ser 


Leu 


Glu 


Glu 
645 


Ile


Gln


Leu 


Leu 


Tyr 


Gly 


Thr 
660 


Asn 


Pro 


Leu 


Asp 


Leu 


Leu 


Gly 
675 


Asp 


Asp 


Ser 


Gln


Lys 
680 



Tyr 


Thr


Leu Pro Gly Ala 


Gln Asp






205 








Lys 


Thr


Glu Ser Ser 


Val 


Ser 


Ser 






220 








Ser 


Pro 


Asn Asn Gly Pro 


Thr Gly 






235 






240 


Val 


Leu 


Pro Pro Thr 


Pro 


His 


Ala 




250






255 




Ile


Leu 


Ile His Glu


Asp 


Ser 


Met 








270 






Pro 


Ser 


Leu Pro Asn 


Ile


Thr 


Leu 






285 








Leu 


Asn 


Ala Ser Asn 


Ser 


Leu 


Lys 






300 








Thr 


Leu 


Arg Gln Gly Val


Pro 


Leu 






315 






320 


Pro 


Ala


Ser Ser Ser 


His 


Pro 


His 




330






335 




Asn 


Ser 


Ser His Gln


Ala 


Leu 


Leu 


345






350 






Met 


Arg 


Gln Gln Lys


Leu 


Leu Val 






365 








Gln


Ser 


Pro Leu Ala 


Thr 


Lys 


Glu 






380 








Thr 


His 


Lys Leu Pro 


Arg 


His 


Arg 






395 






400 


Pro 


Leu 


Pro Gln Ser Thr


Leu 


Ala 




410






415 




Gln


Gln


Phe Leu Glu Lys 


Gln Lys


425 






430 






Asn 


Lys 


Leu Leu Ser 


Lys 


Ser 


Ile






445 








His


Leu 


Glu Glu Ala 


Glu 


Glu 


Glu 






460 








Glu


Asp 


Arg Ala Pro 


Ser 


Ser Gly 






475 






480 


Ala 


Cys 


Val Asp Asp 


Thr 


Leu Gly 




490






495 




Glu 


Glu 


Pro Val Asp 


Ser 


Asp 


Glu 


505






510 






Ser 


Gly 


Glu Gin Ala 


Ala 


Phe 


Met 






525 








His


Thr 


Arg Ala Leu 


Ser 


Val 


Arg 






540 








Met


Asp 


Gly Leu Glu Lys 


His 


Arg 






555 






560 


Pro 


Ala


Ala Ser Val 


Leu 


Pro 


His 




570






575 




Pro 


Gly


Ser Ala Thr 


Gly 


Ile


Ala 


585 






590 






Gln


Cys 


Val Cys Gly Asn 


Ser 


Thr 






605 








Ile


Gln


Ser Ile Trp


Ser 


Arg Leu 






620 








Cys 


Glu 


Arg Ile Gln Gly


Arg Lys 






635 






640 


Val 


His 


Ser Glu His 


His 


Ser 


Leu 




650 






655 




Gly 


Gln


Lys Leu Asp 


Pro 


Arg 


Ile


665 






670 






Phe 


Phe 


Ser Ser Leu 


Pro 


Cys 


Gly 



685
Gly


Leu 


Gly 


Val Asp 


Ser Asp Thr 




690 








695 


Gly 


Ala 


Ala 


Arg Met 


Ala 


Val Gly 


705 








710 




Val 


Ala 


Ser 


Gly Glu 


Leu 


Lys Asn 








725 






Gly 


His 


His 


Ala Glu 


Glu 


Ser Thr 








740 






Ser 


Val 


Ala 


Ile Thr


Ala 


Lys Tyr 






755 






760 


Lys 


Ile


Leu 


Ile Val


Asp 


Leu Asp 




770 








775 


Gln


Ala 


Phe 


Tyr Ala 


Asp 


Pro Ser 


785 








790 




Tyr 


Asp 


Glu 


Gly Asn 


Phe 


Phe Pro 








805 






Gly 


Thr 


Gly 


Leu Gly 


Glu Gly Tyr 








820 






Gly 


Leu 


Asp 


Pro Pro 


Met Gly Asp 






835 






840 


Thr 


Ile


Val 


Lys Pro 


Val 


Ala Lys 




850 








855 


Val 


Ser 


Ala 


Gly Phe 


Asp Ala Leu 


865 








870 




Gly 


Tyr 


Lys 


Val Thr 


Ala 


Lys Cys 








885 






Met 


Thr 


Leu 


Ala Asp 


Gly Arg Val 








900 






Asp 


Leu 


Thr 


Ala Ile


Cys Asp Ala 






915 






920 


Leu 


Gly 


Asn 


Glu Leu 


Glu 


Pro Leu 




930 








935 


Pro 


Asn 


Met 


Asn Ala 


Val 


Ile Ser


945 








950 




Ser 


Met 


Ser 


Leu Lys 


Phe 


Ser 








965 







Ile


Trp Asn Glu Leu 


His 


Ser 


Ser 




700 








Cys 


Val Ile Glu Leu


Ala 


Ser Lys 




715 






720 


Gly 


Phe Ala Val Val 


Arg 


Pro 


Pro 


730 




735 




Ala 


Met Gly Phe Cys 


Phe 


Phe 


Asn 


745 




750 






Leu 


Arg Asp Gln Leu


Asn 


Ile


Ser 




765 








Val 


His His Gly Asn 


Gly Thr 


Gln




780 








Ile


Leu Tyr Ile Ser


Leu 


His 


Arg 




795 






800 


Gly 


Ser Gly Ala Pro 


Asn 


Glu Val 




810 




815 




Asn 


Ile Asn Ile Ala


Trp Thr Gly 


825 




830 






Val 


Glu Tyr Leu Glu 


Ala 


Phe 


Arg 




845 








Glu 


Phe Asp Pro Asp 


Met 


Val 


Leu 




860 








Glu 


Gly His Thr Pro 


Pro Leu Gly 




875 






880 


Phe 


Gly His Leu Thr 


Lys 


Gln


Leu 




890 




895 




Val 


Leu Ala Leu Glu 


Gly Gly His 


905 




910 






Ser 


Glu Ala Cys Val 


Asn 


Ala 


Leu 




925 








Ala 


Glu Asp Ile Leu


His 


Gln


Ser 




940 








Leu 


Gln Lys Ile Ile


Glu 


Ile


Gln




955 






960 



<210> 7 

<211> 3367 

<212> DNA 

<213> Homo sapiens 

<400> 7 

ggggaagaga ggcacagaca cagataggag aagggcaccg gctggagcca cttgcaggac 60 
tgagggtttt tgcaacaaaa ccctagcagc ctgaagaact ctaagccaga tggggtggct 120 
ggacgagagc agctcttggc tcagcaaaga atgcacagta tgatcagctc agtggatgtg 180 
aagtcagaag ttcctgtggg cctggagccc atctcacctt tagacctaag gacagacctc 240 
aggatgatga tgcccgtggt ggaccctgtt gtccgtgaga agcaattgca gcaggaatta 300 
cttcttatcc agcagcagca acaaatccag aagcagcttc tgatagcaga gtttcagaaa 360
cagcatgaga acttgacacg gcagcaccag gctcagcttc aggagcatat caaggaactt 420 
ctagccataa aacagcaaca agaactccta gaaaaggagc agaaactgga gcagcagagg 480 
caagaacagg aagtagagag gcatcgcaga gaacagcagc ttcctcctct cagaggcaaa 540 
gatagaggac gagaaagggc agtggcaagt acagaagtaa agcagaagct tcaagagttc 600 
ctactgagta aatcagcaac gaaagacact ccaactaatg gaaaaaatca ttccgtgagc 660 
cgccatccca agctctggta cacggctgcc caccacacat cattggatca aagctctcca 720 
ccccttagtg gaacatctcc atcctacaag tacacattac caggagcaca agatgcaaag 780 
gatgatttcc cccttcgaaa aactgaatcc tcagtcagta gcagttctcc aggctctggt 840 
cccagttcac caaacaatgg gccaactgga agtgttactg aaaatgagac ttcggttttg 900 
ccccctaccc ctcatgccga gcaaatggtt tcacagcaac gcattctaat tcatgaagat 960 
tccatgaacc tgctaagtct ttatacctct ccttctttgc ccaacattac cttggggctt 1020 
cccgcagtgc catcccagct caatgcttcg aattcactca aagaaaagca gaagtgtgag 1080
acgcagacgc ttaggcaagg tgttcctctg cctgggcagt atggaggcag catcccggca 1140
tcttccagcc accctcatgt tactttagag ggaaagccac ccaacagcag ccaccaggct 1200 
ctcctgcagc atttattatt gaaagaacaa atgcgacagc aaaagcttct tgtagctggt 1260 
ggagttccct tacatcctca gtctcccttg gcaacaaaag agagaatttc acctggcatt 1320 
agaggtaccc acaaattgcc ccgtcacaga cccctgaacc gaacccagtc tgcacctttg 1380 
cctcagagca cgttggctca gctggtcatt caacagcaac accagcaatt cttggagaag 1440
cagaagcaat accagcagca gatccacatg aacaaactgc tttcgaaatc tattgaacaa 1500 
ctgaagcaac caggcagtca ccttgaggaa gcagaggaag agcttcaggg ggaccaggcg 1560 
atgcaggaag acagagcgcc ctctagtggc aacagcacta ggagcgacag cagtgcttgt 1620 
gtggatgaca cactgggaca agttggggct gtgaaggtca aggaggaacc agtggacagt 1680 
gatgaagatg ctcagatcca ggaaatggaa tctggggagc aggctgcttt tatgcaacag 1740 
cctttcctgg aacccacgca cacacgtgcg ctctctgtgc gccaagctcc gctggctgcg 1800 
gttggcatgg atggattaga gaaacaccgt ctcgtctcca ggactcactc ttcccctgct 1860 
gcctctgttt tacctcaccc agcaatggac cgccccctcc agcctggctc tgcaactgga 1920 
attgcctatg accccttgat gctgaaacac cagtgcgttt gtggcaattc caccacccac 1980 
cctgagcatg ctggacgaat acagagtatc tggtcacgac tgcaagaaac tgggctgcta 2040
aataaatgtg agcgaattca aggtcgaaaa gccagcctgg aggaaataca gcttgttcat 2100 
tctgaacatc actcactgtt gtatggcacc aaccccctgg acggacagaa gctggacccc 2160 
aggatactcc taggtgatga ctctcaaaag tttttttcct cattaccttg tggtggactt 2220 
ggggtggaca gtgacaccat ttggaatgag ctacactcgt ccggtgctgc acgcatggct 2280 
gttggctgtg tcatcgagct ggcttccaaa gtggcctcag gagagctgaa gaatgggttt 2340 
gctgttgtga ggccccctgg ccatcacgct gaagaatcca cagccatggg gttctgcttt 2400 
tttaattcag ttgcaattac cgccaaatac ttgagagacc aactaaatat aagcaagata 2460 
ttgattgtag atctggatgt tcaccatgga aacggtaccc agcaggcctt ttatgctgac 2520 
cccagcatcc tgtacatttc actccatcgc tatgatgaag ggaacttttt ccctggcagt 2580 
ggagccccaa atgaggttcg gtttatttct ttagagcccc acttttattt gtatctttca 2640 
ggtaattgca ttgcatgatt acccctaatt ttcttgtcct ttgctggtgt tttaaattac 2700 
acgagattac tgaattgtcc catgggacca agaaccagtg cagaacaagt gcataaccca 2760 
gagcactgtt tgtcagggaa ggttgggctg atttgatgtg ttgtttgatg tttatttcaa 2820 
gagctcccat gtgcttgttt tcctctcttc ttgctttctt ccatttgctc tcttctctgc 2880 
ccaccgtggt gtgtctttct cttcccaggt tggaacaggc cttggagaag ggtacaatat 2940 
aaatattgcc tggacaggtg gccttgatcc tcccatggga gatgttgagt accttgaagc 3000
attcaggacc atcgtgaagc ctgtggccaa agagtttgat ccagacatgg tcttagtatc 3060 
tgctggattt gatgcattgg aaggccacac ccctcctcta ggagggtaca aagtgacggc 3120 
aaaatgtttt ggtcatttga cgaagcaatt gatgacattg gctgatggac gtgtggtgtt 3180 
ggctctagaa ggaggacatg atctcacagc catctgtgat gcatcagaag cctgtgtaaa 3240
tgcccttcta ggaaatgagc tggagccact tgcagaagat attctccacc aaagcccgaa 3300 
tatgaatgct gttatttctt tacagaagat cattgaaatt caaagtatgt ctttaaagtt 3360 
ctcttaa 3367 

<210> 8 
<211> 835 
<212> PRT 

<213> Homo sapiens 



<400> 8 



Met His Ser Met Ile Ser Ser Val Asp Val Lys Ser Glu Val Pro Val
1               5                   10                  15

Gly Leu Glu Pro Ile Ser Pro Leu Asp Leu Arg Thr Asp Leu Arg Met
            20                  25                  30

Met Met Pro Val Val Asp Pro Val Val Arg Glu Lys Gln Leu Gln Gln
        35                  40                  45

Glu Leu Leu Leu Ile Gln Gln Gln Gln Gln Ile Gln Lys Gln Leu Leu
    50                  55                  60

Ile Ala Glu Phe Gln Lys Gln His Glu Asn Leu Thr Arg Gln His Gln
65                  70                  75                  80

Ala Gln Leu Gln Glu His Ile Lys Glu Leu Leu Ala Ile Lys Gln Gln
                85                  90                  95

Gln Glu Leu Leu Glu Lys Glu Gln Lys Leu Glu Gln Gln Arg Gln Glu
            100                 105                 110

Gln Glu Val Glu Arg His Arg Arg Glu Gln Gln Leu Pro Pro Leu Arg
        115                 120                 125

Gly Lys Asp Arg Gly Arg Glu Arg Ala Val Ala Ser Thr Glu Val Lys
    130                 135                 140

Gln Lys Leu Gln Glu Phe Leu Leu Ser Lys Ser Ala Thr Lys Asp Thr
145                 150                 155                 160

Pro Thr Asn Gly Lys Asn His Ser Val Ser Arg His Pro Lys Leu Trp
                165                 170                 175

Tyr Thr Ala Ala His His Thr Ser Leu Asp Gln Ser Ser Pro Pro Leu
            180                 185                 190

Ser Gly Thr Ser Pro Ser Tyr Lys Tyr Thr Leu Pro Gly Ala Gln Asp
        195                 200                 205

Ala Lys Asp Asp Phe Pro Leu Arg Lys Thr Glu Ser Ser Val Ser Ser
    210                 215                 220

Ser Ser Pro Gly Ser Gly Pro Ser Ser Pro Asn Asn Gly Pro Thr Gly
225                 230                 235                 240

Ser Val Thr Glu Asn Glu Thr Ser Val Leu Pro Pro Thr Pro His Ala
                245                 250                 255

Glu Gln Met Val Ser Gln Gln Arg Ile Leu Ile His Glu Asp Ser Met
            260                 265                 270

Asn Leu Leu Ser Leu Tyr Thr Ser Pro Ser Leu Pro Asn Ile Thr Leu
        275                 280                 285

Gly Leu Pro Ala Val Pro Ser Gln Leu Asn Ala Ser Asn Ser Leu Lys
    290                 295                 300

Glu Lys Gln Lys Cys Glu Thr Gln Thr Leu Arg Gln Gly Val Pro Leu
305                 310                 315                 320

Pro Gly Gln Tyr Gly Gly Ser Ile Pro Ala Ser Ser Ser His Pro His
                325                 330                 335

Val Thr Leu Glu Gly Lys Pro Pro Asn Ser Ser His Gln Ala Leu Leu
            340                 345                 350

Gln His Leu Leu Leu Lys Glu Gln Met Arg Gln Gln Lys Leu Leu Val
        355                 360                 365

Ala Gly Gly Val Pro Leu His Pro Gln Ser Pro Leu Ala Thr Lys Glu
    370                 375                 380

Arg Ile Ser Pro Gly Ile Arg Gly Thr His Lys Leu Pro Arg His Arg
385                 390                 395                 400

Pro Leu Asn Arg Thr Gln Ser Ala Pro Leu Pro Gln Ser Thr Leu Ala
                405                 410                 415

Gln Leu Val Ile Gln Gln Gln His Gln Gln Phe Leu Glu Lys Gln Lys
            420                 425                 430

Gln Tyr Gln Gln Gln Ile His Met Asn Lys Leu Leu Ser Lys Ser Ile
        435                 440                 445

Glu Gln Leu Lys Gln Pro Gly Ser His Leu Glu Glu Ala Glu Glu Glu
    450                 455                 460

Leu Gln Gly Asp Gln Ala Met Gln Glu Asp Arg Ala Pro Ser Ser Gly
465                 470                 475                 480

Asn Ser Thr Arg Ser Asp Ser Ser Ala Cys Val Asp Asp Thr Leu Gly
                485                 490                 495

Gln Val Gly Ala Val Lys Val Lys Glu Glu Pro Val Asp Ser Asp Glu
            500                 505                 510

Asp Ala Gln Ile Gln Glu Met Glu Ser Gly Glu Gln Ala Ala Phe Met
        515                 520                 525

Gln Gln Pro Phe Leu Glu Pro Thr His Thr Arg Ala Leu Ser Val Arg
    530                 535                 540

Gln Ala Pro Leu Ala Ala Val Gly Met Asp Gly Leu Glu Lys His Arg
545                 550                 555                 560

Leu Val Ser Arg Thr His Ser Ser Pro Ala Ala Ser Val Leu Pro His
                565                 570                 575

Pro Ala Met Asp Arg Pro Leu Gln Pro Gly Ser Ala Thr Gly Ile Ala
            580                 585                 590

Tyr Asp Pro Leu Met Leu Lys His Gln Cys Val Cys Gly Asn Ser Thr
        595                 600                 605




Thr His Pro Glu His Ala Gly Arg Ile Gln Ser Ile Trp Ser Arg Leu
    610                 615                 620

Gln Glu Thr Gly Leu Leu Asn Lys Cys Glu Arg Ile Gln Gly Arg Lys
625                 630                 635                 640

Ala Ser Leu Glu Glu Ile Gln Leu Val His Ser Glu His His Ser Leu
                645                 650                 655

Leu Tyr Gly Thr Asn Pro Leu Asp Gly Gln Lys Leu Asp Pro Arg Ile
            660                 665                 670

Leu Leu Gly Asp Asp Ser Gln Lys Phe Phe Ser Ser Leu Pro Cys Gly
        675                 680                 685

Gly Leu Gly Val Asp Ser Asp Thr Ile Trp Asn Glu Leu His Ser Ser
    690                 695                 700

Gly Ala Ala Arg Met Ala Val Gly Cys Val Ile Glu Leu Ala Ser Lys
705                 710                 715                 720

Val Ala Ser Gly Glu Leu Lys Asn Gly Phe Ala Val Val Arg Pro Pro
                725                 730                 735

Gly His His Ala Glu Glu Ser Thr Ala Met Gly Phe Cys Phe Phe Asn
            740                 745                 750

Ser Val Ala Ile Thr Ala Lys Tyr Leu Arg Asp Gln Leu Asn Ile Ser
        755                 760                 765

Lys Ile Leu Ile Val Asp Leu Asp Val His His Gly Asn Gly Thr Gln
    770                 775                 780

Gln Ala Phe Tyr Ala Asp Pro Ser Ile Leu Tyr Ile Ser Leu His Arg
785                 790                 795                 800

Tyr Asp Glu Gly Asn Phe Phe Pro Gly Ser Gly Ala Pro Asn Glu Val
                805                 810                 815

Arg Phe Ile Ser Leu Glu Pro His Phe Tyr Leu Tyr Leu Ser Gly Asn
            820                 825                 830

Cys Ile Ala
        835





<210> 9 
<211> 1791 
<212> DNA 

<213> Homo sapiens 
<400> 9 

ggggaagaga ggcacagaca cagataggag aagggcaccg gctggagcca cttgcaggac 60
tgagggtttt tgcaacaaaa ccctagcagc ctgaagaact ctaagccaga tggggtggct 120
ggacgagagc agctcttggc tcagcaaaga atgcacagta tgatcagctc agtggatgtg 180
aagtcagaag ttcctgtggg cctggagccc atctcacctt tagacctaag gacagacctc 240
aggatgatga tgcccgtggt ggaccctgtt gtccgtgaga agcaattgca gcaggaatta 300
cttcttatcc agcagcagca acaaatccag aagcagcttc tgatagcaga gtttcagaaa 360
cagcatgaga acttgacacg gcagcaccag gctcagcttc aggagcatat caaggaactt 420
ctagccataa aacagcaaca agaactccta gaaaaggagc agaaactgga gcagcagagg 480
caagaacagg aagtagagag gcatcgcaga gaacagcagc ttcctcctct cagaggcaaa 540
gatagaggac gagaaagggc agtggcaagt acagaagtaa agcagaagct tcaagagttc 600
ctactgagta aatcagcaac gaaagacact ccaactaatg gaaaaaatca ttccgtgagc 660
cgccatccca agctctggta cacggctgcc caccacacat cattggatca aagctctcca 720
ccccttagtg gaacatctcc atcctacaag tacacattac caggagcaca agatgcaaag 780
gatgatttcc cccttcgaaa aactgaatcc tcagtcagta gcagttctcc aggctctggt 840
cccagttcac caaacaatgg gccaactgga agtgttactg aaaatgagac ttcggttttg 900
ccccctaccc ctcatgccga gcaaatggtt tcacagcaac gcattctaat tcatgaagat 960
tccatgaacc tgctaagtct ttatacctct ccttctttgc ccaacattac cttggggctt 1020
cccgcagtgc catcccagct caatgcttcg aattcactca aagaaaagca gaagtgtgag 1080
acgcagacgc ttaggcaagg tgttcctctg cctgggcagt atggaggcag catcccggca 1140
tcttccagcc accctcatgt tactttagag ggaaagccac ccaacagcag ccaccaggct 1200
ctcctgcagc atttattatt gaaagaacaa atgcgacagc aaaagcttct tgtagctggt 1260
ggagttccct tacatcctca gtctcccttg gcaacaaaag agagaatttc acctggcatt 1320
agaggtaccc acaaattgcc ccgtcacaga cccctgaacc gaacccagtc tgcacctttg 1380
cctcagagca cgttggctca gctggtcatt caacagcaac accagcaatt cttggagaag 1440
cagaagcaat accagcagca gatccacatg aacaaactgc tttcgaaatc tattgaacaa 1500
ctgaagcaac caggcagtca ccttgaggaa gcagaggaag agcttcaggg ggaccaggcg 1560
atgcaggaag acagagcgcc ctctagtggc aacagcacta ggagcgacag cagtgcttgt 1620
gtggatgaca cactgggaca agttggggct gtgaaggtca aggaggaacc agtggacagt 1680 
gatgaagatg ctcagatcca ggaaatggaa tctggggagc aggctgcttt tatgcaacag 1740 
gtaataggca aagatttagc tccaggattt gtaattaaag tcattatctg a 1791 

<210> 10 

<211> 546 

<212> PRT 

<213> Homo sapiens 



<400> 10 



Met His Ser Met Ile Ser Ser Val Asp Val Lys Ser Glu Val Pro Val
1               5                   10                  15

Gly Leu Glu Pro Ile Ser Pro Leu Asp Leu Arg Thr Asp Leu Arg Met
            20                  25                  30

Met Met Pro Val Val Asp Pro Val Val Arg Glu Lys Gln Leu Gln Gln
        35                  40                  45

Glu Leu Leu Leu Ile Gln Gln Gln Gln Gln Ile Gln Lys Gln Leu Leu
    50                  55                  60

Ile Ala Glu Phe Gln Lys Gln His Glu Asn Leu Thr Arg Gln His Gln
65                  70                  75                  80

Ala Gln Leu Gln Glu His Ile Lys Glu Leu Leu Ala Ile Lys Gln Gln
                85                  90                  95

Gln Glu Leu Leu Glu Lys Glu Gln Lys Leu Glu Gln Gln Arg Gln Glu
            100                 105                 110

Gln Glu Val Glu Arg His Arg Arg Glu Gln Gln Leu Pro Pro Leu Arg
        115                 120                 125

Gly Lys Asp Arg Gly Arg Glu Arg Ala Val Ala Ser Thr Glu Val Lys
    130                 135                 140

Gln Lys Leu Gln Glu Phe Leu Leu Ser Lys Ser Ala Thr Lys Asp Thr
145                 150                 155                 160

Pro Thr Asn Gly Lys Asn His Ser Val Ser Arg His Pro Lys Leu Trp
                165                 170                 175

Tyr Thr Ala Ala His His Thr Ser Leu Asp Gln Ser Ser Pro Pro Leu
            180                 185                 190

Ser Gly Thr Ser Pro Ser Tyr Lys Tyr Thr Leu Pro Gly Ala Gln Asp
        195                 200                 205

Ala Lys Asp Asp Phe Pro Leu Arg Lys Thr Glu Ser Ser Val Ser Ser
    210                 215                 220

Ser Ser Pro Gly Ser Gly Pro Ser Ser Pro Asn Asn Gly Pro Thr Gly
225                 230                 235                 240

Ser Val Thr Glu Asn Glu Thr Ser Val Leu Pro Pro Thr Pro His Ala
                245                 250                 255

Glu Gln Met Val Ser Gln Gln Arg Ile Leu Ile His Glu Asp Ser Met
            260                 265                 270

Asn Leu Leu Ser Leu Tyr Thr Ser Pro Ser Leu Pro Asn Ile Thr Leu
        275                 280                 285

Gly Leu Pro Ala Val Pro Ser Gln Leu Asn Ala Ser Asn Ser Leu Lys
    290                 295                 300

Glu Lys Gln Lys Cys Glu Thr Gln Thr Leu Arg Gln Gly Val Pro Leu
305                 310                 315                 320

Pro Gly Gln Tyr Gly Gly Ser Ile Pro Ala Ser Ser Ser His Pro His
                325                 330                 335

Val Thr Leu Glu Gly Lys Pro Pro Asn Ser Ser His Gln Ala Leu Leu
            340                 345                 350

Gln His Leu Leu Leu Lys Glu Gln Met Arg Gln Gln Lys Leu Leu Val
        355                 360                 365

Ala Gly Gly Val Pro Leu His Pro Gln Ser Pro Leu Ala Thr Lys Glu
    370                 375                 380

Arg Ile Ser Pro Gly Ile Arg Gly Thr His Lys Leu Pro Arg His Arg
385                 390                 395                 400
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Pro 


Leu 


Asn 
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Thr 


Gln


Ser 


Ala 


Pro 


Leu 


Pro 


Gln


Ser 


Thr 


Leu 
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44U 


Glu 
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Gin 


Leu Lys Gin 


Pro Gly 


Ser 




/ c a 
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4bb 




Leu 


Gln
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Jll - TUT/--V f_ 

Ala Met 


Gln


465 






470 




Asn 


Ser 


Thr Arg Ser 


Asp Ser 


Ser 






485 






Gln


Val 


Gly Ala Val 


Lys Val 


Lys 






500 






Asp 


Ala 


Gln Ile Gln


Glu Met 


Glu 






515 




520 


Gln


Gln


Val Ile Gly


Lys Asp 


Leu 




530 




535 




Ile


Ile








545 
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<400> 11 



Met 


His 


Ser 


Met 


Ile


Ser Ser 


Val 


1 








5 






Gly 


Leu 


Glu 


Pro 


Ile


Ser Pro 


Leu 








20 








Met 


Met 


Pro 


Val 


Val 


Asp Pro 


Val 






35 








40 


Glu 


Leu 


Leu 


Leu 


Ile


Gln Gln


Gln




50 








55 




Ile


Ala 


Glu 


Phe 


Gln


Lys Gln


His 


65 










70 




Ala 


Gln


Leu


Gln


Glu


His Ile


Lys 










85 






Gln


Glu 


Leu 


Leu 


Glu 


Lys Glu 


Gln








100 








Gln


Glu 


Val 


Glu 


Arg 


His Arg 


Arg 






115 








120 


Gly 


Lys 


Asp 


Arg 


Gly 


Arg Glu 


Arg 




130 








135 




Gln


Lys 


Leu 


Gln


Glu 


Phe Leu 


Leu 


145 










150 




Pro 


Thr 


Asn 


Gly 


Lys 


Asn His 


Ser 










165 






Tyr 


Thr 


Ala 


Ala 


His 


His Thr 


Ser 








180 








Ser 


Gly 


Thr 


Ser 


Pro 


Ser Tyr 


Lys 






195 








200 


Ala 


Lys 


Asp 


Asp 


Phe 


Pro Leu 


Arg 




210 








215 




Lys 


Val 


Arg 


Ser 


Arg 


Leu Lys 


Gln


225 










230 




Pro 


Leu 


Leu 


Arg 


Arg 


Lys Asp 


Gly 










245 






Arg 


Met 


Phe 


Glu 


Val 


Thr Glu 


Ser 






260 








Ser 


Gly 


Pro 


Ser 


Ser 


Pro Asn 


Asn 
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ai n 

f± J.U 




4J.O 




p-l — 

bin 


p i 
bin 


Phe Leu Glu Lys


ni 

bin 


Lys 


A o c. 

4/b 




a i n 






Asn 


Lys 


Leu Leu Ser Lys 


Ser


Ile






A A 
44b 






His 


Leu 


Glu Glu Ala Glu 


Glu 


Glu 






460 






Glu 


Asp 


Arg Ala Pro Ser 


Ser 


Gly 






475 




480 


Ala 


Cys 


Val Asp Asp Thr 


Leu 


Gly 




/ion 




/rye 

4yb 




Glu 


Glu 


Pro Val Asp Ser 


Asp 


Glu 


505 




510 






Ser 


Gly 


Glu Gln Ala Ala


Phe 


Met 






525 






Ala 


Pro 


Gly Phe Val Ile


Lys 


Val 






C A A 
b4U 






Asp 


Val 


Lys Ser Glu Val 


Pro 


TT- T 

vai 




i a 
1U 




1 c 

lb 




Asp 


Leu 


Arg Thr Asp Leu 


Arg 


Met 


25 




■a a 






Val 


Arg 


Glu Lys Gln Leu


Gln


Gln












Gln


Gln


Ile Gln Lys Gln


Leu 


Leu 






o U 






Glu 


Asn 


Leu Thr Arg Gln


His


Gln










Q A 

ou 


Glu 


Leu 


Leu Ala Ile Lys


Gln


Gln








yb 




Lys 


Leu 


Glu Gln Gln Arg


Gln


Glu


105 




1 1 A 
11U 






Glu


Gln


Gln Leu Pro Pro


Leu 


Arg 






Izb 






Ala 


Val


Ala Ser Thr Glu


Val


Lys 






±4U 






Ser 


Lys 


Ser Ala Thr Lys


Asp


Thr






ibb 






"Mai 

vol 


Ser 


Arg His Pro Lys


Leu 


Trp 




JL / U 




X / 3 




Leu 


Asp 


Gln Ser Ser Pro


Pro 


Leu 


185 




190 






Tyr 


Thr 


Leu Pro Gly Ala 


Gln


Asp 






2 05 






Lys 


Thr 


Ala Ser Glu Pro 


Asn 


Leu 






220 






Lys 


Val 


Ala Glu Arg Arg 


Ser 


Ser 






235 




240 


Asn 


Val 


Val Thr Ser Phe 


Lys 


Lys 




250 




255 




Ser 


Val 


Ser Ser Ser Ser 


Pro 


Gly 


265 




270 






Gly 


Pro 


Thr Gly Ser Val 


Thr 


Glu 



WO 02/102984 



PCT/US02/19051 










one 
Aid 








280 


Asn 


Glu


Thr 


Ser 


Val 


Leu 


Pro Pro 














295 


Ser 


Gln


Gln


Arg


Ile


Leu


Ile His


*3 f\ C 

305 










310 




Leu 


Tyr 


Thr 


Ser 


Pro 


Ser 


Leu Pro 










325 






Val 


Pro 


Ser


Gln


Leu 


Asn 


Ala Ser 








^ /in 








Cys 


Glu 


Thr 


Gln


Thr Leu Arg Gln




o c c 
355 








360 


Gly 


Gly 


Ser 


Ile


Pro 


Ala 


Ser Ser 




"> ""7 A 

370 










375 


Gly 


Lys 


Pro 


Pro 


Asn 


Ser 


Ser His 


385 










390 




Leu 


Lys 


Glu 


Gln


Met


Arg


Gln Gln










405 






Pro 


Leu 


His 


Pro 


Gln


Ser 


Pro Leu 








420 








Gly 


Ile


Arg 


Gly 


Thr 


His 


Lys Leu 






A o cr 

435 








440 


Thr 


Gln


Ser 


Ala 


Pro 


Leu 


Pro Gln




450 










455 


Gln


Gln


Gln


His


Gln


Gln


Phe Leu 


465 










470 




Gln


Ile


His 


Met 


Asn 


Lys 


Leu Leu 










485 






Gln


Pro 


Gly 


Ser 


His 


Leu 


Glu Glu 








500 








Gln


Ala


Met


Gln


Glu Asp Arg Ala 






515 








520 


Ser 


Asp 


Ser 


Ser 


Ala Cys Val Asp 




530 










535 


Val 


Lys 


Val 


Lys 


Glu 


Glu 


Pro Val 


545 










550




Gln


Glu 


Met 


Glu 


Ser 


Gly Glu Gln










565 






Gly 


Lys 


Asp 


Leu 


Ala Pro Gly Phe 








580 
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Met 


Ser 


Ser 


Gln


Ser His Pro Asp 
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Val 


Glu 


Leu 
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Asn Pro Ala Arg 
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Asp 


Val 


Ala 


Thr 


Ala Leu Pro Leu 






35 




40 


Met 


Asp 


Leu 


Arg 


Leu Asp His Gln




50 






55 


Ala 


Leu 


Arg 


Glu 
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65 








70 


Lys 


Gln


Gln


Ile
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85 


His 


Glu 


Gln


Leu


Ser Arg Gln His
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Lys 


Gln


Gln
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Met 


Val 
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335 
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Ser 
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ft J u 
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His Arg 


Pro 
1 1 \J 
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Asn 
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Ser 


1 ILL 
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Tl p 

11C 






a c n 
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Lys 


Gln Lys
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Tyr 


Pi n 
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4 /D 








Aft fi 
*± O b 


Ser 


Lys 


Co-*- Tin 

ber lie 


PI n 
WIU 


Pin 
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Leu 


Lys 




490 








495 




Ala 


Glu 


Glu Glu 


Leu 


Gln


Gly 


Asp 


505 
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Pro 


Ser 


Ser Gly 


Asn 


Ser 


Thr 


Arg 








D A D 








Asp 


Thr 


Leu Gly 


Gln


Val 


Gly 


Ala 






540 










Asp 


Ser 


Asp Glu 


Asp 


Ala 
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Tl O 
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Asp 


Gln


Pro 




10 
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Val 


Asn 


His Met 
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Ser 


Thr 


Val 


25 








30 






Gln


Val 


Ala Pro 


Ser 


Ala 


Val 


Pro 








45 








Phe 


Ser 


Leu Pro 


Val 


Ala 


Glu 


Pro 






60 










Gln


Glu 
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Ala 
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Lys 


Gln






75 








80 


Leu 


Ile


Ala Glu


Phe


Gln


Arg


Gln




90 
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Glu 


Ala 
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Glu 
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He 


105 








110 






Met 


Lys 


His Gln


Gln


Glu 


Leu 


Leu 
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115 








120 










125 








b lu 


nib 


Gln


Arg 


Lys 


Leu 


Glu Arg 


His 




Gln


Glu


Gln


pi n 


Leu 


pi ii 

blU 




-LOU 










135 








140 










Lys 


Pi n 


His 


Arg 


Glu Gln Lys Leu




pi n 


T.oi i 


Lys Asn 


Lys 
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blU 


Lys 


1 AS 










150 
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i fin 
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Lys 


Glu 
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Ala 
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Ala Ser 
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ul Li 
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Leu 
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.Val 


Leu 


Asn Lys 


Lys Lys 


rxJLCL 


Leu 


nld 


His Arg 


Asn 


Leu 


Asn 
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Leu 
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Arg 
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Lys 
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Asp 
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285 








Asp 


• 

Ser 


Ala 
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Ser 
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Pro 
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Asn 
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He 
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Pro 








340 
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Thr 


Leu 
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355 
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Pi Y-l 
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Pro Gly Thr His 


Leu 


rp"U v 

inr 


Pro 


Tyr 
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inr 
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390 
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Leu 


Glu 'Arg 


Asp Gly Gly Ala 


Ala 
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Leu 


Leu 


Pi n 

bin 
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405 
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Gin 


Pro Pro 


Ala 


pi i-i 
bin 


Al a 

Ala 


Pro 


Leu 
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Thr 


Pi 17 

biy 








420 


















a ^ n 

ft j u 






Leu 




Ala 


Leu 


Pro 


Leu 


His Ala 


pi ti 
bin 


Ser 


Leu 


Val Gly 


Ala 
Ala 


Asp 


Arg 






435 








440 










445 








vdl 


Ser 


Pro 


Ser 


He 


His 


Lys Leu 


Arg 


Pi n 

bin 




Arg 


Pro 


Leu 


Pi "vr 

biy 


Arg 




ar n 










455 








460 










1 XlxT 




Ser 


Ala 


Pro 


Leu 


Pro Gin 


Asn 


Ala 


Pi Tl 

bin 


Ala 


Leu 


Pin 

bin 


nlS 


Leu 


Afi R 










470 








A7 R 










Aan 

fi O U 


Val 
val 


Tl ~ 

lie 


Gln


Gln


Gln
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Gin Gin 


irne 


Leu 


PI n 
blu 


Lys His 




Pi n 


PI n 










485 








AQn 










AQR 

ft -/ J 




lrJ.lt; 


Pi n 


Gln


Gln


Gln


Leu


Gln Met


Asn 


Lys 


Tl o 

lie 


Ile


Pro 




IT J. \J 


Cor 








500 








DUO 










si n 








pyn 

4TX w 


Ala Arg 


Gln
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Glu Ser 
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C J. KJ 
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V3l U. 


Glu 


Thr 


Glu 


Pill 

\J7-L LL 


Pi 11 






515 








520 










525 








Leu 


Arg 


Glu 


His 


Gln


Ala 


Leu Leu 


Asp 


Glu 


Pro 


Tyr 


Leu 


Asp 


Arg 


Leu 




530 










535 








540 










Pro 


Gly 


Gln


Lys 


Glu 


Ala 


His Ala 


Gln


Ala 


Gly 


Val 


Gln


Val 


Lys 


Gln


545 










550 








555 










560 


Glu 


Pro 


Ile


Glu 


Ser Asp Glu Glu 


Glu 


Ala 


Glu 


Pro 


Pro 


Arg 


Glu 


Val 










565 








570 










575 




Glu 


Pro 


Gly Gln


Arg


Gln


Pro Ser 


Glu 


Gln


Glu 


Leu 


Leu 


Phe 


Arg 


Gln








580 








585 










590 






Gln


Ala 


Leu 


Leu 


Leu 


Glu 


Gln Gln


Arg


Ile


His


Gln Leu


Arg 


Asn 


Tyr 






595 
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Gin 


Ala 


Ser 


Met 


Glu Ala Ala Gly 


Ile
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Val 


Ser 


Phe 


Gly 


Gly 


His 
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610 






615 










DZ U 




Arg 


Pro 


Leu 


Ser Arg Ala Gln


Ser 


Ser 


Pro 


7\ I a 
Jn.±a 


OCX ±\-LCL ± i. 1 J. 


Phe Pro 


625 








630 








KIR 




640 


Val 


Ser 


Val 


Gln


Glu Pro Pro 


Thr 


Lys 


Pro 


Arg 


Jrne 1IJX 1X1X. 












645 






cert 
ODU 






655 


Val 


Tyr 


Asp 


Thr 


Leu Met Leu 


Lys 


illS 


uin 


Cys 


nix tys oj.y 


Coy* 




660 






Oo j 






670 




Ser 


Ser 


His 


Pro 


Glu His Ala 


Gly 


Arg 


Ile


vjin 


061 J.±e i-j-p 


dor Av-rr 






675 






boU 








QOJ 




Leu 


Gln


Glu 


Thr Gly Leu Arg 


Gly 


T •« rt-t 

Lys 


Cys 


IjlU 


L.yS lie AXy 


/~* "1 -i7- 7A yrf 
vj±y nl y 




690 






695 














Lys 


Ala 


Thr 


Leu 


Glu Glu Leu 


Gln


Thr 


vai 


HIS 


oci tj-LU Ala. 


705 








710 








/ ID 




7?n 


Leu 


Leu 


Tyr 


Gly Thr Asn Pro 


Leu 


Asn 


Arg 


bin 


ijys lieu Asp 


Cor T.17G 








725 






/iU 






Tic 


Lys 


Leu 


Leu 


Gly Ser Leu Ala 


Ser 


Val 


Phe 


Val 


Arg Leu Pro 








740 






/4b 










Gly 


Val 


Gly 


Val Asp Ser Asp 


Thr 


Ile


Trp 


Asn 


(jiu vai tils 


- 

Ser Ala 




755 






H d f\ 

760 








/ (3D 




Gly 


Ala 


Ala 


Arg Leu Ala Val 


Gly 


Cys 


Val 


vai 


\jiu Lieu vai 


Jriie J-iy b 


770 






775 










/ OU 




Val 


Ala 


Thr 


Gly Glu Leu Lys 


Asn 


Gly 


Phe 


Ala 


Vai vai Arg 


Fro irxO 


785 








790 








/ y d 




800 


Gly 


His 


His 


Ala 


Glu Glu Ser 


Thr 


Pro 


Met 


Gly 


Phe Cys Tyr 


irne Asn 








805 






olU 






R 1 ^ 
ulO 


Ser 


Val 


Ala 


Val 


Ala Ala Lys 


Leu 


Leu 


Gln


Gln


Arg Leu Ser 


vai oci 








820 






o o c 
OZD 










Lys 


Ile


Leu


Ile Val Asp Trp


Asp 


val 


His 


HIS 


tjiy asii oiy 


i hit vjiii 




835 






Q A f\ 








Q /I C 




Gln


Ala 


Phe 


Tyr 


Ser Asp Pro 


Ser 


Val 


Leu 


Tyr 


Men faer jbeu 


nis /ixy 




850 




855 










ODU 




Tyr 


Asp 


Asp 


Gly 


Asn Phe Phe 


Pro 


Gly 


Ser 


Gly 


Ala Pro Asp 


pin "tral 

Liiu vai 


865 








870 








Q 1 K 




oo u 


Gly 


Thr 


Gly 


Pro 


Gly Val Gly 


Phe 


Asn 


Val 


Asn 


Met Ala pne 


j. nr v^iy 






885 






O Q Pi 

oy U 








Gly 


Leu 


Asp 


Pro 


Pro Met Gly 


Asp 


Ala 


Glu 


Tyr 


Leu Ala Ala 


Fae Arg 




900 












y iu 




Thr 


Val 


Val 


Met 


Pro Ile Ala


Ser 


Glu 


Phe 


Ala 


Pro Asp Val 


vai lieu 






915 






920 












Val 


Ser 


Ser 


Gly 


Phe Asp Ala 


Val 


Glu 


Gly 


rilS 


Pro Thr Pro 


lieu oJ.y 




930 






935 














Gly 


Tyr 


Asn 


Leu 


Ser Ala Arg 


Cys 


Phe 


Gly 


Tyr 


Leu Thr Lys 


Gln Leu


945 






950 








955 




J D U 


Met 


Gly 


Leu 


Ala 


Gly Gly Arg 


Ile


Val 


Leu 


Ala 


Leu Glu Gly 


uiy xiiD 








965 






970 






j / D 


Asp 


Leu 


Thr 


Ala 


Ile Cys Asp


Ala 


Ser 


Glu 


Ala 


Cys Val Ser 


Ala Leu 






980 






985 






990 




Leu 


Gly 


Asn 


Glu 


Leu Asp Pro 
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<211> 3550 

<212> DNA 

<213> Homo sapiens 

<400> 13 

ggggaagaga ggcacagaca cagataggag aagggcaccg gctggagcca cttgcaggac 60 
tgagggtttt tgcaacaaaa ccctagcagc ctgaagaact ctaagccaga tggggtggct 120 
ggacgagagc agctcttggc tcagcaaaga atgcacagta tgatcagctc agtggatgtg 180 
aagtcagaag ttcctgtggg cctggagccc atctcacctt tagacctaag gacagacctc 240 
aggatgatga tgcccgtggt ggaccctgtt gtccgtgaga agcaattgca gcaggaatta 300 
cttcttatcc agcagcagca acaaatccag aagcagcttc tgatagcaga gtttcagaaa 360 
cagcatgaga acttgacacg gcagcaccag gctcagcttc aggagcatat caaggaactt 420 
ctagccataa aacagcaaca agaactccta gaaaaggagc agaaactgga gcagcagagg 480 
caagaacagg aagtagagag gcatcgcaga gaacagcagc ttcctcctct cagaggcaaa 540 
gatagaggac gagaaagggc agtggcaagt acagaagtaa agcagaagct tcaagagttc 600 
ctactgagta aatcagcaac gaaagacact ccaactaatg gaaaaaatca ttccgtgagc 660 
cgccatccca agctctggta cacggctgcc caccacacat cattggatca aagctctcca 720 
ccccttagtg gaacatctcc atcctacaag tacacattac caggagcaca agatgcaaag 780 
gatgatttcc cccttcgaaa aactgcctct gagcccaact tgaaggtgcg gtccaggtta 840 
aaacagaaag tggcagagag gagaagcagc cccttactca ggcggaagga tggaaatgtt 900 
gtcacttcat tcaagaagcg aatgtttgag gtgacagaat cctcagtcag tagcagttct 960 
ccaggctctg gtcccagttc accaaacaat gggccaactg gaagtgttac tgaaaatgag 1020 
acttcggttt tgccccctac ccctcatgcc gagcaaatgg tttcacagca acgcattcta 1080 
attcatgaag attccatgaa cctgctaagt ctttatacct ctccttcttt gcccaacatt 1140 
accttggggc ttcccgcagt gccatcccag ctcaatgctt cgaattcact caaagaaaag 1200 
cagaagtgtg agacgcagac gcttaggcaa ggtgttcctc tgcctgggca gtatggaggc 1260 
agcatcccgg catcttccag ccaccctcat gttactttag agggaaagcc acccaacagc 1320 
agccaccagg ctctcctgca gcatttatta ttgaaagaac aaatgcgaca gcaaaagctt 1380 
cttgtagctg gtggagttcc cttacatcct cagtctccct tggcaacaaa agagagaatt 1440 
tcacctggca ttagaggtac ccacaaattg ccccgtcaca gacccctgaa ccgaacccag 1500 
tctgcacctt tgcctcagag cacgttggct cagctggtca ttcaacagca acaccagcaa 1560 
ttcttggaga agcagaagca ataccagcag cagatccaca tgaacaaact gctttcgaaa 1620 
tctattgaac aactgaagca accaggcagt caccttgagg aagcagagga agagcttcag 1680 
ggggaccagg cgatgcagga agacagagcg ccctctagtg gcaacagcac taggagcgac 1740 
agcagtgctt gtgtggatga cacactggga caagttgggg ctgtgaaggt caaggaggaa 1800 
ccagtggaca gtgatgaaga tgctcagatc caggaaatgg aatctgggga gcaggctgct 1860 
tttatgcaac aggtaatagg caaagattta gctccaggat ttgtaattaa agtcattatc 1920 
tgacctttcc tggaacccac gcacacacgt gcgctctctg tgcgccaagc tccgctggct 1980 
gcggttggca tggatggatt agagaaacac cgtctcgtct ccaggactca ctcttcccct 2040 
gctgcctctg ttttacctca cccagcaatg gaccgccccc tccagcctgg ctctgcaact 2100 
ggaattgcct atgacccctt gatgctgaaa caccagtgcg tttgtggcaa ttccaccacc 2160 
caccctgagc atgctggacg aatacagagt atctggtcac gactgcaaga aactgggctg 2220 
ctaaataaat gtgagcgaat tcaaggtcga aaagccagcc tggaggaaat acagcttgtt 2280 
cattctgaac atcactcact gttgtatggc accaaccccc tggacggaca gaagctggac 2340 
cccaggatac tcctaggtga tgactctcaa aagttttttt cctcattacc ttgtggtgga 2400 
cttggggtgg acagtgacac catttggaat gagctacact cgtccggtgc tgcacgcatg 2460 
gctgttggct gtgtcatcga gctggcttcc aaagtggcct caggagagct gaagaatggg 2520 
tttgctgttg tgaggccccc tggccatcac gctgaagaat ccacagccat ggggttctgc 2580 
ttttttaatt cagttgcaat taccgccaaa tacttgagag accaactaaa tataagcaag 2640 
atattgattg tagatctgga tgttcaccat ggaaacggta cccagcaggc cttttatgct 2700 
gaccccagca tcctgtacat ttcactccat cgctatgatg aagggaactt tttccctggc 2760 
agtggagccc caaatgaggt tcggtttatt tctttagagc cccactttta tttgtatctt 2820 
tcaggtaatt gcattgcatg attaccccta attttcttgt cctttgctgg tgttttaaat 2880 
tacacgagat tactgaattg tcccatggga ccaagaacca gtgcagaaca agtgcataac 2940 
ccagagcact gtttgtcagg gaaggttggg ctgatttgat gtgttgtttg atgtttattt 3000 
caagagctcc catgtgcttg ttttcctctc ttcttgcttt cttccatttg ctctcttctc 3060 
tgcccaccgt ggtgtgtctt tctcttccca ggttggaaca ggccttggag aagggtacaa 3120 
tataaatatt gcctggacag gtggccttga tcctcccatg ggagatgttg agtaccttga 3180 
agcattcagg accatcgtga agcctgtggc caaagagttt gatccagaca tggtcttagt 3240 
atctgctgga tttgatgcat tggaaggcca cacccctcct ctaggagggt acaaagtgac 3300 
ggcaaaatgt tttggtcatt tgacgaagca attgatgaca ttggctgatg gacgtgtggt 3360 
gttggctcta gaaggaggac atgatctcac agccatctgt gatgcatcag aagcctgtgt 3420 
aaatgccctt ctaggaaatg agctggagcc acttgcagaa gatattctcc accaaagccc 3480 
gaatatgaat gctgttattt ctttacagaa gatcattgaa attcaaagta tgtctttaaa 3540 
gttctcttaa 3550 

<210> 14 

<211> 7699 

<212> DNA 

<213> Homo sapiens 

<400> 14 

cccattcgcc attcaggctg cgcaactgtt gggaagggcg atcggtgcgg gcctcttcgc 60 
tattacgcca gctggcgaaa gggggatgtg ctgcaaggcg attaagttgg gtaacgccca 120 
gggttttccc agtcacgacg ttgtaaaacg acggccagtg ccaagctgat ctaatcaata 180 
ttggccatta gccatattat tcattggtta tatagcataa atcaatattg gctattggcc 240 
attgcatacg ttgtatccat atcataatat gtacatttat attggctcat gtccaacatt 300 
accgccatgt tgacattgat tattgactag ttattaatag taatcaatta cggggtcatt 360 
agttcatagc ccatatatgg agttccgcgt tacataactt acggtaaatg gcccgcctgg 420 
cgaccgccca gcgacccccg cccgttgacg tcaatagtga cgtatgttcc catagtaacg 480 
ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac tgcccacttg 540 
gcagtacatc aagtgtatca tatgccaagt ccgcccccta ttgacgtcaa tgacggtaaa 600 
tggcccgcct agcattatgc ccagtacatg accttacggg agtttcctac ttggcagtac 660 
atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta caccaatggg 720 
cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga cgtcaatggg 780 
agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaataa ccccgccccg 840 
ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag agctcgttta 900 
gtgaaccgtc agaattcaag cttgcggccg cagatctatc gatctgcagg atatcaccat 960 
gcacagtatg atcagctcag tggatgtgaa gtcagaagtt cctgtgggcc tggagcccat 1020 
ctcaccttta gacctaagga cagacctcag gatgatgatg cccgtggtgg accctgttgt 1080 
ccgtgagaag caattgcagc aggaattact tcttatccag cagcagcaac aaatccagaa 1140 
gcagcttctg atagcagagt ttcagaaaca gcatgagaac ttgacacggc agcaccaggc 1200 
tcagcttcag gagcatatca aggaacttct agccataaaa cagcaacaag aactcctaga 1260 
aaaggagcag aaactggagc agcagaggca agaacaggaa gtagagaggc atcgcagaga 1320 
acagcagctt cctcctctca gaggcaaaga tagaggacga gaaagggcag tggcaagtac 1380 
agaagtaaag cagaagcttc aagagttcct actgagtaaa tcagcaacga aagacactcc 1440 
aactaatgga aaaaatcatt ccgtgagccg ccatcccaag ctctggtaca cggctgccca 1500 
ccacacatca ttggatcaaa gctctccacc ccttagtgga acatctccat cctacaagta 1560 
cacattacca ggagcacaag atgcaaagga tgatttcccc cttcgaaaaa ctgcctctga 1620 
gcccaacttg aaggtgcggt ccaggttaaa acagaaagtg gcagagagga gaagcagccc 1680 
cttactcagg cggaaggatg gaaatgttgt cacttcattc aagaagcgaa tgtttgaggt 1740 
gacagaatcc tcagtcagta gcagttctcc aggctctggt cccagttcac caaacaatgg 1800 
gccaactgga agtgttactg aaaatgagac ttcggttttg ccccctaccc ctcatgccga 1860 
gcaaatggtt tcacagcaac gcattctaat tcatgaagat tccatgaacc tgctaagtct 1920 
ttatacctct ccttctttgc ccaacattac cttggggctt cccgcagtgc catcccagct 1980 
caatgcttcg aattcactca aagaaaagca gaagtgtgag acgcagacgc ttaggcaagg 2040 
tgttcctctg cctgggcagt atggaggcag catcccggca tcttccagcc accctcatgt 2100 
tactttagag ggaaagccac ccaacagcag ccaccaggct ctcctgcagc atttattatt 2160 
gaaagaacaa atgcgacagc aaaagcttct tgtagctggt ggagttccct tacatcctca 2220 
gtctcccttg gcaacaaaag agagaatttc acctggcatt agaggtaccc acaaattgcc 2280 
ccgtcacaga cccctgaacc gaacccagtc tgcacctttg cctcagagca cgttggctca 2340 
gctggtcatt caacagcaac accagcaatt cttggagaag cagaagcaat accagcagca 2400 
gatccacatg aacaaactgc tttcgaaatc tattgaacaa ctgaagcaac caggcagtca 2460 
ccttgaggaa gcagaggaag agcttcaggg ggaccaggcg atgcaggaag acagagcgcc 2520 
ctctagtggc aacagcacta ggagcgacag cagtgcttgt gtggatgaca cactgggaca 2580 
agttggggct gtgaaggtca aggaggaacc agtggacagt gatgaagatg ctcagatcca 2640 
ggaaatggaa tctggggagc aggctgcttt tatgcaacag cctttcctgg aacccacgca 2700 
cacacgtgcg ctctctgtgc gccaagctcc gctggctgcg gttggcatgg atggattaga 2760 
gaaacaccgt ctcgtctcca ggactcactc ttcccctgct gcctctgttt tacctcaccc 2820 
agcaatggac cgccccctcc agcctggctc tgcaactgga attgcctatg accccttgat 2880 
gctgaaacac cagtgcgttt gtggcaattc caccacccac cctgagcatg ctggacgaat 2940 
acagagtatc tggtcacgac tgcaagaaac tgggctgcta aataaatgtg agcgaattca 3000 
aggtcgaaaa gccagcctgg aggaaataca gcttgttcat tctgaacatc actcactgtt 3060 
gtatggcacc aaccccctgg acggacagaa gctggacccc aggatactcc taggtgatga 3120 
ctctcaaaag tttttttcct cattaccttg tggtggactt ggggtggaca gtgacaccat 3180 
ttggaatgag ctacactcgt ccggtgctgc acgcatggct gttggctgtg tcatcgagct 3240 
ggcttccaaa gtggcctcag gagagctgaa gaatgggttt gctgttgtga ggccccctgg 3300 
ccatcacgct gaagaatcca cagccatggg gttctgcttt tttaattcag ttgcaattac 3360 
cgccaaatac ttgagagacc aactaaatat aagcaagata ttgattgtag atctggatgt 3420 
tcaccatgga aacggtaccc agcaggcctt ttatgctgac cccagcatcc tgtacatttc 3480 
actccatcgc tatgatgaag ggaacttttt ccctggcagt ggagccccaa atgaggttgg 3540 
aacaggcctt ggagaagggt acaatataaa tattgcctgg acaggtggcc ttgatcctcc 3600 
catgggagat gttgagtacc ttgaagcatt caggaccatc gtgaagcctg tggccaaaga 3660 
gtttgatcca gacatggtct tagtatctgc tggatttgat gcattggaag gccacacccc 3720 
tcctctagga gggtacaaag tgacggcaaa atgttttggt catttgacga agcaattgat 3780 
gacattggct gatggacgtg tggtgttggc tctagaagga ggacatgatc tcacagccat 3840 
ctgtgatgca tcagaagcct gtgtaaatgc ccttctagga aatgagctgg agccacttgc 3900 
agaagatatt ctccaccaaa gcccgaatat gaatgctgtt atttctttac agaagatcat 3960 
tgaaattcaa agtatgtctt taaagttctc tggatccggt accagattac aaggacgacg 4020 
atgacaagta gatcccgggt ggcatccctg tgacccctcc ccagtgcctc tcctggcctt 4080 
ggaagttgcc actccagtgc ccaccagcct tgtcctaata aaattaagtt gcatcatttt 4140 
gtctgactag gtgtcctcta taatattatg gggtggaggg gggtggtatg gagcaagggg 4200 
cccaagttgg gaagacaacc tgtagggcct gcggggtcta ttcgggaacc aagctggagt 4260 
gcagtggcac aatcttggct cactgcaatc tccgcctcct gggttcaagc gattctcctg 4320 
cctcagcctc ccgagttgtt gggattccag gcatgcatga ccaggctcag ctaatttttg 4380 
tttttttggt agagacgggg tttcaccata ttggccaggc tggtctccaa ctcctaatct 4440 
caggtgatct acccaccttg gcctcccaaa ttgctgggat tacaggcgtg aaccactgct 4500 
cccttccctg tccttctgat tttaaaataa ctataccagc aggaggacgt ccagacacag 4560 
cataggctac ctgccatggc ccaaccggtg ggacatttga gttgcttgct tggcactgtc 4620 
ctctcatgcg ttgggtccac tcagtagatg cctgttgaat tgggtacgcg gccagcttct 4680 
gtggaatgtg tgtcagttag ggtgtggaaa gtccccaggc tccccagcag gcagaagtat 4740 
gcaaagcatg catctcaatt agtcagcaac caggtgtgga aaagtcccca ggctccccag 4800 
caggcagaag tatgcaaagc atgcatctca attagtcagc aaccatagtc ccgcccctaa 4860 
ctccgcccat cccgccccta actccgccca gttccgccca ttctccgccc catggctgac 4920 
taattttttt tatttatgca gaggccgagg ccgcctcggc ctctgagcta ttccagaagt 4980 
agtgaggagg cttttttgga ggcctaggct tttgcaaaaa gctcctcgag gaactgaaaa 5040 
accagaaagt taattcccta tagtgagtcg tattaaattc gtaatcatgg tcatagctgt 5100 
ttcctgtgtg aaattgttat ccgctcacaa ttccacacaa catacgagcc ggaagcataa 5160 
agtgtaaagc ctggggtgcc taatgagtga gctaactcac attaattgcg ttgcgctcac 5220 
tgcccgcttt ccagtcggga aacctgtcgt gccagctgca ttaatgaatc ggccaacgcg 5280 
cggggagagg cggtttgcgt attgggcgct cttccgcttc ctcgctcact gactcgctgc 5340 
gctcggtcgt tcggctgcgg cgagcggtat cagctcactc aaaggcggta atacggttat 5400 
ccacagaatc aggggataac gcaggaaaga acatgtgagc aaaaggccag caaaaggcca 5460 
ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag gctccgcccc cctgacgagc 5520 
atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc gacaggacta taaagatacc 5580 
aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg 5640 
gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct ttctcaatgc tcacgctgta 5700 
ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg 5760 
ttcagcccga ccgctgcgcc ttatccggta actatcgtct tgagtccaac ccggtaagac 5820 
acgacttatc gccactggca gcagccactg gtaacaggat tagcagagcg aggtatgtag 5880 
gcggtgctac agagttcttg aagtggtggc ctaactacgg ctacactaga agaacagtat 5940 
ttggtatctg cgctctgctg aagccagtta ccttcggaaa aagagttggt agctcttgat 6000 
ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc 6060 
gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc tacggggtct gacgctcagt 6120 
ggaacgaaaa ctcacgttaa gggattttgg tcatgagatt atcaaaaagg atcttcacct 6180 
agatcctttt aaattaaaaa tgaagtttta aatcaatcta aagtatatat gagtaaactt 6240 
ggtctgacag ttaccaatgc ttaatcagtg aggcacctat ctcagcgatc tgtctatttc 6300 
gttcatccat agttgcctga ctccccgtcg tgtagataac tacgatacgg gagggcttac 6360 
catctggccc cagtgctgca atgataccgc gagacccacg ctcaccggct ccagatttat 6420 
cagcaataaa ccagccagcc ggaagggccg agcgcagaag tggtcctgca actttatccg 6480 
cctccatcca gtctattaat tgttgccggg aagctagagt aagtagttcg ccagttaata 6540 
gtttgcgcaa cgttgttgcc attgctacag gcatcgtggt gtcacgctcg tcgtttggta 6600 
tggcttcatt cagctccggt tcccaacgat caaggcgagt tacatgatcc cccatgttgt 6660 
gcaaaaaagc ggttagctcc ttcggtcctc cgatcgttgt cagaagtaag ttggccgcag 6720 
tgttatcact catggttatg gcagcactgc ataattctct tactgtcatg ccatccgtaa 6780 
gatgcttttc tgtgactggt gagtactcaa ccaagtcatt ctgagaatag tgtatgcggc 6840 
gaccgagttg ctcttgcccg gcgtcaatac gggataatac cgcgccacat agcagaactt 6900 
taaaagtgct catcattgga aaacgttctt cggggcgaaa actctcaagg atcttaccgc 6960 
tgttgagatc cagttcgatg taacccactc gtgcacccaa ctgatcttca gcatctttta 7020 
ctttcaccag cgtttctggg tgagcaaaaa caggaaggca aaatgccgca aaaaagggaa 7080 
taagggcgac acggaaatgt tgaatactca tactcttcct ttttcaatat tattgaagca 7140 
tttatcaggg ttattgtctc atgagcggat acatatttga atgtatttag aaaaataaac 7200 
aaataggggt tccgcgcaca tttccccgaa aagtgccacc tgacgcgccc tgtagcggcg 7260 
cattaagcgc ggcgggtgtg gtggttacgc gcagcgtgac cgctacactt gccagcgccc 7320 
tagcgcccgc tcctttcgct ttcttccctt cctttctcgc cacgttcgcc ggctttcccc 7380 
gtcaagctct aaatcggggc atccctttag ggttccgatt tagtgcttta cggcacctcg 7440 
accccaaaaa acttgattag ggtgatggtt cacgtagtgg gccatcgccc tgatagacgg 7500 
tttttcgccc tttgacgttg gagtccacgt tctttaatag tggactcttg ttccaaactg 7560 
gaacaacact caaccctatc tcggtctatt cttttgattt ataagggatt ttgccgattt 7620 
cggcctattg gttaaaaaat gagctgattt aacaaaaatt taacgcgaat tttaacaaaa 7680 
tattaaacgt ttacaattt 7699 



<210> 15 

<211> 7303 

<212> DNA 

<213> Homo sapiens 

<400> 15 

cccattcgcc attcaggctg cgcaactgtt gggaagggcg atcggtgcgg gcctcttcgc 60 
tattacgcca gctggcgaaa gggggatgtg ctgcaaggcg attaagttgg gtaacgccca 120 
gggttttccc agtcacgacg ttgtaaaacg acggccagtg ccaagctgat ctaatcaata 180 
ttggccatta gccatattat tcattggtta tatagcataa atcaatattg gctattggcc 240 
attgcatacg ttgtatccat atcataatat gtacatttat attggctcat gtccaacatt 300 
accgccatgt tgacattgat tattgactag ttattaatag taatcaatta cggggtcatt 360 
agttcatagc ccatatatgg agttccgcgt tacataactt acggtaaatg gcccgcctgg 420 
cgaccgccca gcgacccccg cccgttgacg tcaatagtga cgtatgttcc catagtaacg 480 
ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac tgcccacttg 540 
gcagtacatc aagtgtatca tatgccaagt ccgcccccta ttgacgtcaa tgacggtaaa 600 
tggcccgcct agcattatgc ccagtacatg accttacggg agtttcctac ttggcagtac 660 
atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta caccaatggg 720 
cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga cgtcaatggg 780 
agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaataa ccccgccccg 840 
ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag agctcgttta 900 
gtgaaccgtc agaattcaag cttgcggccg cagatctatc gatctgcagg atatcaccat 960 
gcacagtatg atcagctcag tggatgtgaa gtcagaagtt cctgtgggcc tggagcccat 1020 
ctcaccttta gacctaagga cagacctcag gatgatgatg cccgtggtgg accctgttgt 1080 
ccgtgagaag caattgcagc aggaattact tcttatccag cagcagcaac aaatccagaa 1140 
gcagcttctg atagcagagt ttcagaaaca gcatgagaac ttgacacggc agcaccaggc 1200 
tcagcttcag gagcatatca aggaacttct agccataaaa cagcaacaag aactcctaga 1260 
aaaggagcag aaactggagc agcagaggca agaacaggaa gtagagaggc atcgcagaga 1320 
acagcagctt cctcctctca gaggcaaaga tagaggacga gaaagggcag tggcaagtac 1380 
agaagtaaag cagaagcttc aagagttcct actgagtaaa tcagcaacga aagacactcc 1440 
aactaatgga aaaaatcatt ccgtgagccg ccatcccaag ctctggtaca cggctgccca 1500 
ccacacatca ttggatcaaa gctctccacc ccttagtgga acatctccat cctacaagta 1560 
cacattacca ggagcacaag atgcaaagga tgatttcccc cttcgaaaaa ctgcctctga 1620 
gcccaacttg aaggtgcggt ccaggttaaa acagaaagtg gcagagagga gaagcagccc 1680 
cttactcagg cggaaggatg gaaatgttgt cacttcattc aagaagcgaa tgtttgaggt 1740 
gacagaatcc tcagtcagta gcagttctcc aggctctggt cccagttcac caaacaatgg 1800 
gccaactgga agtgttactg aaaatgagac ttcggttttg ccccctaccc ctcatgccga 1860 
gcaaatggtt tcacagcaac gcattctaat tcatgaagat tccatgaacc tgctaagtct 1920 
ttatacctct ccttctttgc ccaacattac cttggggctt cccgcagtgc catcccagct 1980 
caatgcttcg aattcactca aagaaaagca gaagtgtgag acgcagacgc ttaggcaagg 2040 
tgttcctctg cctgggcagt atggaggcag catcccggca tcttccagcc accctcatgt 2100 
tactttagag ggaaagccac ccaacagcag ccaccaggct ctcctgcagc atttattatt 2160 
gaaagaacaa atgcgacagc aaaagcttct tgtagctggt ggagttccct tacatcctca 2220 
gtctcccttg gcaacaaaag agagaatttc acctggcatt agaggtaccc acaaattgcc 2280 
ccgtcacaga cccctgaacc gaacccagtc tgcacctttg cctcagagca cgttggctca 2340 
gctggtcatt caacagcaac accagcaatt cttggagaag cagaagcaat accagcagca 2400 
gatccacatg aacaaactgc tttcgaaatc tattgaacaa ctgaagcaac caggcagtca 2460 
ccttgaggaa gcagaggaag agcttcaggg ggaccaggcg atgcaggaag acagagcgcc 2520 
ctctagtggc aacagcacta ggagcgacag cagtgcttgt gtggatgaca cactgggaca 2580 
agttggggct gtgaaggtca aggaggaacc agtggacagt gatgaagatg ctcagatcca 2640 
ggaaatggaa tctggggagc aggctgcttt tatgcaacag cctttcctgg aacccacgca 2700 
cacacgtgcg ctctctgtgc gccaagctcc gctggctgcg gttggcatgg atggattaga 2760 
gaaacaccgt ctcgtctcca ggactcactc ttcccctgct gcctctgttt tacctcaccc 2820 
agcaatggac cgccccctcc agcctggctc tgcaactgga attgcctatg accccttgat 2880 
gctgaaacac cagtgcgttt gtggcaattc caccacccac cctgagcatg ctggacgaat 2940 
acagagtatc tggtcacgac tgcaagaaac tgggctgcta aataaatgtg agcgaattca 3000 
aggtcgaaaa gccagcctgg aggaaataca gcttgttcat tctgaacatc actcactgtt 3060 
gtatggcacc aaccccctgg acggacagaa gctggacccc aggatactcc taggtgatga 3120 
ctctcaaaag tttttttcct cattaccttg tggtggactt ggggtggaca gtgacaccat 3180 
ttggaatgag ctacactcgt ccggtgctgc acgcatggct gttggctgtg tcatcgagct 3240 
ggcttccaaa gtggcctcag gagagctgaa gaatgggttt gctgttgtga ggccccctgg 3300 
ccatcacgct gaagaatcca cagccatggg gttctgcttt tttaattcag ttgcaattac 3360 
cgccaaatac ttgagagacc aactaaatat aagcaagata ttgattgtag atctggatgt 3420 
tcaccatgga aacggtaccc agcaggcctt ttatgctgac cccagcatcc tgtacatttc 3480 
actccatcgc tatgatgaag ggaacttttt ccctggcagt ggagccccaa atgaggttcg 3540 
gtttatttct ttagagcccc acttttattt gtatctttca ggtaattgca ttgcaggatc 3600 
cggtaccaga ttacaaggac gacgatgaca agtagatccc gggtggcatc cctgtgaccc 3660 
ctccccagtg cctctcctgg ccttggaagt tgccactcca gtgcccacca gccttgtcct 3720 
aataaaatta agttgcatca ttttgtctga ctaggtgtcc tctataatat tatggggtgg 3780 
aggggggtgg tatggagcaa ggggcccaag ttgggaagac aacctgtagg gcctgcgggg 3840 
tctattcggg aaccaagctg gagtgcagtg gcacaatctt ggctcactgc aatctccgcc 3900 
tcctgggttc aagcgattct cctgcctcag cctcccgagt tgttgggatt ccaggcatgc 3960 
atgaccaggc tcagctaatt tttgtttttt tggtagagac ggggtttcac catattggcc 4020 
aggctggtct ccaactccta atctcaggtg atctacccac cttggcctcc caaattgctg 4080 
ggattacagg cgtgaaccac tgctcccttc cctgtccttc tgattttaaa ataactatac 4140 
cagcaggagg acgtccagac acagcatagg ctacctgcca tggcccaacc ggtgggacat 4200 
ttgagttgct tgcttggcac tgtcctctca tgcgttgggt ccactcagta gatgcctgtt 4260 
gaattgggta cgcggccagc ttctgtggaa tgtgtgtcag ttagggtgtg gaaagtcccc 4320 
aggctcccca gcaggcagaa gtatgcaaag catgcatctc aattagtcag caaccaggtg 4380 
tggaaaagtc cccaggctcc ccagcaggca gaagtatgca aagcatgcat ctcaattagt 4440 
cagcaaccat agtcccgccc ctaactccgc ccatcccgcc cctaactccg cccagttccg 4500 
cccattctcc gccccatggc tgactaattt tttttattta tgcagaggcc gaggccgcct 4560 
cggcctctga gctattccag aagtagtgag gaggcttttt tggaggccta ggcttttgca 4620 
aaaagctcct cgaggaactg aaaaaccaga aagttaattc cctatagtga gtcgtattaa 4680 
attcgtaatc atggtcatag ctgtttcctg tgtgaaattg ttatccgctc acaattccac 4740 
acaacatacg agccggaagc ataaagtgta aagcctgggg tgcctaatga gtgagctaac 4800 
tcacattaat tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg tcgtgccagc 4860 
tgcattaatg aatcggccaa cgcgcgggga gaggcggttt gcgtattggg cgctcttccg 4920 
cttcctcgct cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg gtatcagctc 4980 
actcaaaggc ggtaatacgg ttatccacag aatcagggga taacgcagga aagaacatgt 5040 
gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg gcgtttttcc 5100 
ataggctccg cccccctgac gagcatcaca aaaatcgacg ctcaagtcag aggtggcgaa 5160 
acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc gtgcgctctc 5220 
ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg ggaagcgtgg 5280 
cgctttctca atgctcacgc tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc 5340 
tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc ggtaactatc 5400 
gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc actggtaaca 5460 
ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg tggcctaact 5520 
acggctacac tagaagaaca gtatttggta tctgcgctct gctgaagcca gttaccttcg 5580 
gaaaaagagt tggtagctct tgatccggca aacaaaccac cgctggtagc ggtggttttt 5640 
ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat cctttgatct 5700 
tttctacggg gtctgacgct cagtggaacg aaaactcacg ttaagggatt ttggtcatga 5760 
gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt tttaaatcaa 5820 
tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc agtgaggcac 5880 
ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc gtcgtgtaga 5940 
taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata ccgcgagacc 6000 
cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg gccgagcgca 6060 
gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc cgggaagcta 6120 
gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct acaggcatcg 6180 
tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa cgatcaaggc 6240 
gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt cctccgatcg 6300 
ttgtcagaag taagttggcc gcagtgttat cactcatggt tatggcagca ctgcataatt 6360 
ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac tcaaccaagt 6420 
cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca atacgggata 6480 
ataccgcgcc acatagcaga actttaaaag tgctcatcat tggaaaacgt tcttcggggc 6540 
gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc actcgtgcac 6600 
ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca aaaacaggaa 6660 
ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata ctcatactct 6720 
tcctttttca atattattga agcatttatc agggttattg tctcatgagc ggatacatat 6780 
ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg cacatttccc cgaaaagtgc 6840 
cacctgacgc gccctgtagc ggcgcattaa gcgcggcggg tgtggtggtt acgcgcagcg 6900 
tgaccgctac acttgccagc gccctagcgc ccgctccttt cgctttcttc ccttcctttc 6960 
tcgccacgtt cgccggcttt ccccgtcaag ctctaaatcg gggcatccct ttagggttcc 7020 
gatttagtgc tttacggcac ctcgacccca aaaaacttga ttagggtgat ggttcacgta 7080 
gtgggccatc gccctgatag acggtttttc gccctttgac gttggagtcc acgttcttta 7140 
atagtggact cttgttccaa actggaacaa cactcaaccc tatctcggtc tattcttttg 7200 
atttataagg gattttgccg atttcggcct attggttaaa aaatgagctg atttaacaaa 7260 
aatttaacgc gaattttaac aaaatattaa acgtttacaa ttt 7303 



<210> 16 

<211> 24 

<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Primer used to amplify human DNA 
<400> 16 

ccatggaaac ggtacccagc aggc 24 

<210> 17 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer used to amplify human DNA 
<400> 17 

cactccatcg ctatgatgaa ggg 23 

<210> 18 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer used to amplify human DNA 



<400> 18 

agttcccttc atcatagcga tgg 23 

<210> 19 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer used to amplify human DNA 



<400> 19 

aatgtacagg atgctggggt 20 



<210> 20 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer used to amplify human DNA 
<400> 20 

cccttgtagc tggtggagtt ccctt 25 

<210> 21 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer used to amplify human DNA 

<400> 21 

tgtgtcatcg agctggcttc 20 

<210> 22 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer used to amplify human DNA 

<400> 22 

atcttctgca agtggctcca 20 
Thr His Ser Ser Pro Ala Ala Ser Val Leu Pro His Pro Ala Met Asp
    610                 615                 620

Arg Pro Leu Gln Pro Gly Ser Ala Thr Gly Ile Ala Tyr Asp Pro Leu
625                 630                 635                 640

Met Leu Lys His Gln Cys Val Cys Gly Asn Ser Thr Thr His Pro Glu
                645                 650                 655

His Ala Gly Arg Ile Gln Ser Ile Trp Ser Arg Leu Gln Glu Thr Gly
            660                 665                 670

Leu Leu Asn Lys Cys Glu Arg Ile Gln Gly Arg Lys Ala Ser Leu Glu
        675                 680                 685

Glu Ile Gln Leu Val His Ser Glu His His Ser Leu Leu Tyr Gly Thr
    690                 695                 700

Asn Pro Leu Asp Gly Gln Lys Leu Asp Pro Arg Ile Leu Leu Gly Asp
705                 710                 715                 720

Asp Ser Gln Lys Phe Phe Ser Ser Leu Pro Cys Gly Gly Leu Gly Val
                725                 730                 735

Asp Ser Asp Thr Ile Trp Asn Glu Leu His Ser Ser Gly Ala Ala Arg
            740                 745                 750

Met Ala Val Gly Cys Val Ile Glu Leu Ala Ser Lys Val Ala Ser Gly
        755                 760                 765

Glu Leu Lys Asn Gly Phe Ala Val Val Arg Pro Pro Gly His His Ala
    770                 775                 780

Glu Glu Ser Thr Ala Met Gly Phe Cys Phe Phe Asn Ser Val Ala Ile
785                 790                 795                 800

Thr Ala Lys Tyr Leu Arg Asp Gln Leu Asn Ile Ser Lys Ile Leu Ile
                805                 810                 815

Val Asp Leu Asp Val His His Gly Asn Gly Thr Gln Gln Ala Phe Tyr
            820                 825                 830

Ala Asp Pro Ser Ile Leu Tyr Ile Ser Leu His Arg Tyr Asp Glu Gly
        835                 840                 845

Asn Phe Phe Pro Gly Ser Gly Ala Pro Asn Glu Val Arg Phe Ile Ser
    850                 855                 860

Leu Glu Pro His Phe Tyr Leu Tyr Leu Ser Gly Asn Cys Ile Ala
865                 870                 875



<210> 5 

<211> 3054 

<212> DNA 

<213> Homo sapiens 

<400> 5 

ggggaagaga ggcacagaca cagataggag aagggcaccg gctggagcca cttgcaggac 60 
tgagggtttt tgcaacaaaa ccctagcagc ctgaagaact ctaagccaga tggggtggct 120 
ggacgagagc agctcttggc tcagcaaaga atgcacagta tgatcagctc agtggatgtg 180 
aagtcagaag ttcctgtggg cctggagccc atctcacctt tagacctaag gacagacctc 240 
aggatgatga tgcccgtggt ggaccctgtt gtccgtgaga agcaattgca gcaggaatta 300 
cttcttatcc agcagcagca acaaatccag aagcagcttc tgatagcaga gtttcagaaa 360 
cagcatgaga acttgacacg gcagcaccag gctcagcttc aggagcatat caaggaactt 420 
ctagccataa aacagcaaca agaactccta gaaaaggagc agaaactgga gcagcagagg 480 
caagaacagg aagtagagag gcatcgcaga gaacagcagc ttcctcctct cagaggcaaa 540 
gatagaggac gagaaagggc agtggcaagt acagaagtaa agcagaagct tcaagagttc 600 
ctactgagta aatcagcaac gaaagacact ccaactaatg gaaaaaatca ttccgtgagc 660 
cgccatccca agctctggta cacggctgcc caccacacat cattggatca aagctctcca 720 
ccccttagtg gaacatctcc atcctacaag tacacattac caggagcaca agatgcaaag 780 
gatgatttcc cccttcgaaa aactgaatcc tcagtcagta gcagttctcc aggctctggt 840 
cccagttcac caaacaatgg gccaactgga agtgttactg aaaatgagac ttcggttttg 900 
ccccctaccc ctcatgccga gcaaatggtt tcacagcaac gcattctaat tcatgaagat 960 
tccatgaacc tgctaagtct ttatacctct ccttctttgc ccaacattac cttggggctt 1020 
cccgcagtgc catcccagct caatgcttcg aattcactca aagaaaagca gaagtgtgag 1080 
acgcagacgc ttaggcaagg tgttcctctg cctgggcagt atggaggcag catcccggca 1140 
tcttccagcc accctcatgt tactttagag ggaaagccac ccaacagcag ccaccaggct 1200 
ctcctgcagc atttattatt gaaagaacaa atgcgacagc aaaagcttct tgtagctggt 1260 
ggagttccct tacatcctca gtctcccttg gcaacaaaag agagaatttc acctggcatt 1320 
agaggtaccc acaaattgcc ccgtcacaga cccctgaacc gaacccagtc tgcacctttg 1380 
cctcagagca cgttggctca gctggtcatt caacagcaac accagcaatt cttggagaag 1440 
cagaagcaat accagcagca gatccacatg aacaaactgc tttcgaaatc tattgaacaa 1500 
ctgaagcaac caggcagtca ccttgaggaa gcagaggaag agcttcaggg ggaccaggcg 1560 
atgcaggaag acagagcgcc ctctagtggc aacagcacta ggagcgacag cagtgcttgt 1620 
gtggatgaca cactgggaca agttggggct gtgaaggtca aggaggaacc agtggacagt 1680 
gatgaagatg ctcagatcca ggaaatggaa tctggggagc aggctgcttt tatgcaacag 1740 
cctttcctgg aacccacgca cacacgtgcg ctctctgtgc gccaagctcc gctggctgcg 1800 
gttggcatgg atggattaga gaaacaccgt ctcgtctcca ggactcactc ttcccctgct 1860 
gcctctgttt tacctcaccc agcaatggac cgccccctcc agcctggctc tgcaactgga 1920 
attgcctatg accccttgat gctgaaacac cagtgcgttt gtggcaattc caccacccac 1980 
cctgagcatg ctggacgaat acagagtatc tggtcacgac tgcaagaaac tgggctgcta 2040 
aataaatgtg agcgaattca aggtcgaaaa gccagcctgg aggaaataca gcttgttcat 2100 
tctgaacatc actcactgtt gtatggcacc aaccccctgg acggacagaa gctggacccc 2160 
aggatactcc taggtgatga ctctcaaaag tttttttcct cattaccttg tggtggactt 2220 
ggggtggaca gtgacaccat ttggaatgag ctacactcgt ccggtgctgc acgcatggct 2280 
gttggctgtg tcatcgagct ggcttccaaa gtggcctcag gagagctgaa gaatgggttt 2340 
gctgttgtga ggccccctgg ccatcacgct gaagaatcca cagccatggg gttctgcttt 2400 
tttaattcag ttgcaattac cgccaaatac ttgagagacc aactaaatat aagcaagata 2460 
ttgattgtag atctggatgt tcaccatgga aacggtaccc agcaggcctt ttatgctgac 2520 
cccagcatcc tgtacatttc actccatcgc tatgatgaag ggaacttttt ccctggcagt 2580 
ggagccccaa atgaggttgg aacaggcctt ggagaagggt acaatataaa tattgcctgg 2640 
acaggtggcc ttgatcctcc catgggagat gttgagtacc ttgaagcatt caggaccatc 2700 
gtgaagcctg tggccaaaga gtttgatcca gacatggtct tagtatctgc tggatttgat 2760 
gcattggaag gccacacccc tcctctagga gggtacaaag tgacggcaaa atgttttggt 2820 
catttgacga agcaattgat gacattggct gatggacgtg tggtgttggc tctagaagga 2880 
ggacatgatc tcacagccat ctgtgatgca tcagaagcct gtgtaaatgc ccttctagga 2940 
aatgagctgg agccacttgc agaagatatt ctccaccaaa gcccgaatat gaatgctgtt 3000 
atttctttac agaagatcat tgaaattcaa agtatgtctt taaagttctc ttaa 3054 

<210> 6 
<211> 967 
<212> PRT 

<213> Homo sapiens 
<400> 6 



Met His Ser Met Ile Ser Ser Val Asp Val Lys Ser Glu Val Pro Val
1               5                   10                  15

Gly Leu Glu Pro Ile Ser Pro Leu Asp Leu Arg Thr Asp Leu Arg Met
            20                  25                  30

Met Met Pro Val Val Asp Pro Val Val Arg Glu Lys Gln Leu Gln Gln
        35                  40                  45

Glu Leu Leu Leu Ile Gln Gln Gln Gln Gln Ile Gln Lys Gln Leu Leu
    50                  55                  60

Ile Ala Glu Phe Gln Lys Gln His Glu Asn Leu Thr Arg Gln His Gln
65                  70                  75                  80

Ala Gln Leu Gln Glu His Ile Lys Glu Leu Leu Ala Ile Lys Gln Gln
                85                  90                  95

Gln Glu Leu Leu Glu Lys Glu Gln Lys Leu Glu Gln Gln Arg Gln Glu
            100                 105                 110

Gln Glu Val Glu Arg His Arg Arg Glu Gln Gln Leu Pro Pro Leu Arg
        115                 120                 125

Gly Lys Asp Arg Gly Arg Glu Arg Ala Val Ala Ser Thr Glu Val Lys
    130                 135                 140

Gln Lys Leu Gln Glu Phe Leu Leu Ser Lys Ser Ala Thr Lys Asp Thr
145                 150                 155                 160

Pro Thr Asn Gly Lys Asn His Ser Val Ser Arg His Pro Lys Leu Trp
                165                 170                 175

Tyr Thr Ala Ala His His Thr Ser Leu Asp Gln Ser Ser Pro Pro Leu
            180                 185                 190
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Ser 


Gly 


Thr 


Ser 






195 




Ala 


Lys 


Asp 


Asp 




210 






Ser 


Ser 


Pro 


Gly 


Z ZD 








Ser 


Val 


Thr 


Glu 


Glu 


Gin 


Met 


Val 








260 


Asn 


Leu 


Leu 


Ser 






275 




Gly 


Leu 


Pro 


Ala 




290 






Glu 


Lys 


Gin 


Lys 


one 

305 








Pro 


Gly 


Gin 


Tyr 


Val 


Thr 


Leu 


Glu 








340 


Gin 


His 


Leu 


Leu 






355 




Ala 


Gly 


Gly 


Val 




370 






Arg 


He 


Ser 


Pro 


385








Pro 


Leu 


Asn 


Arg 


Gin 


Leu 


Val 


He 








420 


Gin 


Tyr 


Gin 


Gin 






435 




Glu 


Gin 


Leu 


Lys 




450 






Leu 


Gin 


Gly 


Asp 


465








Asn 


Ser 


Thr 


Arg 


Gin 


Val 


Gly 


Ala 








500 


Asp 


Ala 


Gin 


He 






515 




Gin 


Gin 


Pro 


Phe 




530 






Gin 


Ala 


Pro 


Leu 


545








Leu 


Val 


Ser 


Arg 


Pro 


Ala 


Met 


Asp 








580 


Tyr 


Asp 


Pro 


Leu 






595 




Thr 


His 


Pro 


Glu 




610 






Gin 


Glu 


Thr 


Gly 


625 








Ala 


Ser 


Leu 


Glu 


Leu 


Tyr 


Gly 


Thr 








660 


Leu 


Leu 


Gly 


Asp 






675 





Pro Ser Tyr Lys 
200 

Phe Pro Leu Arg 
215 

Ser Gly Pro Ser 
230 

Asn Glu Thr Ser 
245 

Ser Gln Gln Arg

Leu Tyr Thr Ser 
280 

Val Pro Ser Gln
295 

Cys Glu Thr Gln
310 

Gly Gly Ser Ile
325 

Gly Lys Pro Pro 

Leu Lys Glu Gln
360 

Pro Leu His Pro 
375 

Gly Ile Arg Gly
390 

Thr Gln Ser Ala
405 

Gln Gln Gln His

Gln Ile His Met
440 

Gln Pro Gly Ser
455 

Gln Ala Met Gln
470 

Ser Asp Ser Ser 
485 

Val Lys Val Lys 

Gln Glu Met Glu
520 

Leu Glu Pro Thr 
535 

Ala Ala Val Gly 
550 

Thr His Ser Ser 
565 

Arg Pro Leu Gln

Met Leu Lys His 
600 

His Ala Gly Arg 
615 

Leu Leu Asn Lys 
630 

Glu Ile Gln Leu
645 

Asn Pro Leu Asp 

Asp Ser Gln Lys
680 
Tyr


Thr 


Leu 


Pro 


Lys 


Thr 


Glu 


Ser 








220 


Ser 


Pro 


Asn 


Asn 






235 




Val 


Leu 


Pro 


Pro 




250 






He 


Leu 


He 


His 


265 








Pro 


Ser 


Leu 


Pro 


Leu 


Asn 


Ala 


Ser 








300 


Thr 


Leu 


Arg 


Gin 






315 




Pro 


Ala 


Ser 


Ser 




330 






Asn 


Ser 


Ser 


His 


345 








Met Arg 


Gin 


Gin 


Gin 


Ser 


Pro 


Leu 








380 


Thr 


His 


Lys 


Leu 






395 




Pro 


Leu 


Pro 


Gin 




410 






Gin 


Gin 


Phe 


Leu 


425 








Asn 


Lys 


Leu 


Leu 


His 


Leu 


Glu 


Glu 








460 


Glu Asp 


Arg 


Ala 






475 




Ala Cys 


Val 


Asp 




490 






Glu 


Glu 


Pro 


Val 


505 








Ser Gly 


Glu 


Gin 


His 


Thr 


Arg 


Ala 








540 


Met 


Asp 


Gly 


Leu 






555 




Pro 


Ala 


Ala 


Ser 




570 






Pro Gly 


Ser 


Ala 


585 








Gln Cys


Val 


Cys 


He 


Gin 


Ser 


He 








620 


Cys 


Glu 


Arg 


He 






635 




Val 


His 


Ser 


Glu 




650 






Gly Gln


Lys 


Leu 


665 








Phe 


Phe 


Ser 


Ser 



Gly 


Ala 


Gin 


Asp 


205 








Ser 


Val 


Ser 


Ser 


Gly 


Pro 


Thr 


Gly 








240 


Thr 


Pro 


His 


Ala 






255 




Glu 


Asp 


Ser 


Met 




270 






Asn 


He 


Thr 


Leu 


285 








Asn 


Ser 


Leu 


Lys 


Gly 


Val 


Pro 


Leu 








320 


Ser 


His 


Pro 


His 






335 




Gin 


Ala 


Leu 


Leu 




350 






Lys 


Leu 


Leu 


Val 










Ala 


Thr 


Lys 


Glu 


Pro 


Arg 


His 


Arg 








400 


Ser 


Thr 


Leu 


Ala 






415 




Glu 


Lys 


Gin 


Lys 




430 






Ser 


Lys 


Ser 


He 










Ala 


Glu 


Glu 


Glu 


Pro 


Ser 


Ser 


Gly 








480 


Asp 


Thr 


Leu 


Gly 






495 




Asp 


Ser 


Asp 


Glu 




510 






Ala 


Ala 


Phe 


Met 


525 








Leu 


Ser 


Val 


Arg 


Glu 


Lys 


His 


Arg 








560 


Val 


Leu 


Pro 


His 






575 




Thr 


Gly 


He 


Ala 




590 






Gly 


Asn 


Ser 


Thr 


605 








Trp 


Ser 


Arg 


Leu 


Gin 


Gly 


Arg 


Lys 








640 


His 


His 


Ser 


Leu 






655 




Asp 


Pro 


Arg 


He 




670 






Leu 


Pro 


Cys 


Gly 


685 
Gly


Leu 


Gly 


Val 


Asp Ser Asp Thr Ile Trp


Asn 


Glu 


Leu 


His 


Ser 


Ser 




690 






695 




700 










Gly 


Ala 


Ala 


Arg 


Met Ala Val Gly Cys Val 


He 


Glu 


Leu 


Ala 


Ser 


Lys 


705 








710 


715 










720 


Val 


Ala 


Ser 


Gly 


Glu Leu Lys Asn Gly Phe 


Ala 


Val 


Val 


Arg 


Pro 


Pro 










725 730 










735 




Gly 


His 


His 


Ala 


Glu Glu Ser Thr Ala Met 


Gly 


Phe 


Cys 


Phe 


Phe 


Asn 








740 


745 








750 






Ser 


Val 


Ala 


lie 


Thr Ala Lys Tyr Leu Arg 


Asp 


Gin 


Leu 


Asn 


He 


Ser 






755 




760 






765 








Lys 


He 


Leu 


He 


Val Asp Leu Asp Val His 


His 


Gly Asn 


Gly Thr 


Gin 




770 






775 




780 










Gin 


Ala 


Phe 


Tyr 


Ala Asp Pro Ser Ile Leu


Tyr 


He 


Ser 


Leu 


His 


Arg 


785 








790 


795 










800 


Tyr 


Asp 


Glu 


Gly 


Asn Phe Phe Pro Gly Ser 


Gly Ala 


Pro 


Asn 


Glu 


Val 










805 810 










815 




Gly 


Thr 


Gly 


Leu 


Gly Glu Gly Tyr Asn Ile


Asn 


He 


Ala 


Trp Thr Gly 








820 


825 








830 






Gly 


Leu 


Asp 


Pro 


Pro Met Gly Asp Val Glu 


Tyr Leu Glu 


Ala 


Phe 


Arg 






835 




840 






845 








Thr 


He 


Val 


Lys 


Pro Val Ala Lys Glu Phe 


Asp 


Pro Asp 


Met 


Val 


Leu 




850 






855 




860 










Val 


Ser 


Ala 


Gly 


Phe Asp Ala Leu Glu Gly 


His 


Thr 


Pro 


Pro Leu Gly 


865 








870 


875 










880 


Gly 


Tyr 


Lys 


Val 


Thr Ala Lys Cys Phe Gly 


His 


Leu 


Thr 


Lys 


Gin 


Leu 










885 890 










895 




Met 


Thr 


Leu 


Ala 


Asp Gly Arg Val Val Leu 


Ala 


Leu 


Glu 


Gly Gly His 








900 


905 








910 






Asp 


Leu 


Thr 


Ala 


Ile Cys Asp Ala Ser Glu


Ala Cys Val 


Asn 


Ala 


Leu 






915 




920 






925 








Leu 


Gly 


Asn 


Glu 


Leu Glu Pro Leu Ala Glu 


Asp 


He 


Leu 


His 


Gin 


Ser 




930 






935 




940 










Pro 


Asn 


Met 


Asn 


Ala Val Ile Ser Leu Gln


Lys 


He 


He 


Glu 


He 


Gin 


945 








950 


955 










960 


Ser 


Met 


Ser 


Leu 


Lys Phe Ser 















965 



<210> 7 
<211> 3367 
<212> DNA 

<213> Homo sapiens 
<400> 7 

ggggaagaga ggcacagaca cagataggag aagggcaccg gctggagcca cttgcaggac 60 

tgagggtttt tgcaacaaaa ccctagcagc ctgaagaact ctaagccaga tggggtggct 120 

ggacgagagc agctcttggc tcagcaaaga atgcacagta tgatcagctc agtggatgtg 180 

aagtcagaag ttcctgtggg cctggagccc atctcacctt tagacctaag gacagacctc 240 

aggatgatga tgcccgtggt ggaccctgtt gtccgtgaga agcaattgca gcaggaatta 300

cttcttatcc agcagcagca acaaatccag aagcagcttc tgatagcaga gtttcagaaa 360

cagcatgaga acttgacacg gcagcaccag gctcagcttc aggagcatat caaggaactt 420 

ctagccataa aacagcaaca agaactccta gaaaaggagc agaaactgga gcagcagagg 480 

caagaacagg aagtagagag gcatcgcaga gaacagcagc ttcctcctct cagaggcaaa 540 

gatagaggac gagaaagggc agtggcaagt acagaagtaa agcagaagct tcaagagttc 600 

ctactgagta aatcagcaac gaaagacact ccaactaatg gaaaaaatca ttccgtgagc 660 

cgccatccca agctctggta cacggctgcc caccacacat cattggatca aagctctcca 720 

ccccttagtg gaacatctcc atcctacaag tacacattac caggagcaca agatgcaaag 780 

gatgatttcc cccttcgaaa aactgaatcc tcagtcagta gcagttctcc aggctctggt 840 

cccagttcac caaacaatgg gccaactgga agtgttactg aaaatgagac ttcggttttg 900 

ccccctaccc ctcatgccga gcaaatggtt tcacagcaac gcattctaat tcatgaagat 960 

tccatgaacc tgctaagtct ttatacctct ccttctttgc ccaacattac cttggggctt 1020 

cccgcagtgc catcccagct caatgcttcg aattcactca aagaaaagca gaagtgtgag 1080 
acgcagacgc ttaggcaagg tgttcctctg cctgggcagt atggaggcag catcccggca 1140

tcttccagcc accctcatgt tactttagag ggaaagccac ccaacagcag ccaccaggct 1200

ctcctgcagc atttattatt gaaagaacaa atgcgacagc aaaagcttct tgtagctggt 1260

ggagttccct tacatcctca gtctcccttg gcaacaaaag agagaatttc acctggcatt 1320

agaggtaccc acaaattgcc ccgtcacaga cccctgaacc gaacccagtc tgcacctttg 1380

cctcagagca cgttggctca gctggtcatt caacagcaac accagcaatt cttggagaag 1440

cagaagcaat accagcagca gatccacatg aacaaactgc tttcgaaatc tattgaacaa 1500

ctgaagcaac caggcagtca ccttgaggaa gcagaggaag agcttcaggg ggaccaggcg 1560

atgcaggaag acagagcgcc ctctagtggc aacagcacta ggagcgacag cagtgcttgt 1620

gtggatgaca cactgggaca agttggggct gtgaaggtca aggaggaacc agtggacagt 1680

gatgaagatg ctcagatcca ggaaatggaa tctggggagc aggctgcttt tatgcaacag 1740

cctttcctgg aacccacgca cacacgtgcg ctctctgtgc gccaagctcc gctggctgcg 1800

gttggcatgg atggattaga gaaacaccgt ctcgtctcca ggactcactc ttcccctgct 1860

gcctctgttt tacctcaccc agcaatggac cgccccctcc agcctggctc tgcaactgga 1920

attgcctatg accccttgat gctgaaacac cagtgcgttt gtggcaattc caccacccac 1980

cctgagcatg ctggacgaat acagagtatc tggtcacgac tgcaagaaac tgggctgcta 2040

aataaatgtg agcgaattca aggtcgaaaa gccagcctgg aggaaataca gcttgttcat 2100

tctgaacatc actcactgtt gtatggcacc aaccccctgg acggacagaa gctggacccc 2160

aggatactcc taggtgatga ctctcaaaag tttttttcct cattaccttg tggtggactt 2220

ggggtggaca gtgacaccat ttggaatgag ctacactcgt ccggtgctgc acgcatggct 2280

gttggctgtg tcatcgagct ggcttccaaa gtggcctcag gagagctgaa gaatgggttt 2340

gctgttgtga ggccccctgg ccatcacgct gaagaatcca cagccatggg gttctgcttt 2400

tttaattcag ttgcaattac cgccaaatac ttgagagacc aactaaatat aagcaagata 2460

ttgattgtag atctggatgt tcaccatgga aacggtaccc agcaggcctt ttatgctgac 2520

cccagcatcc tgtacatttc actccatcgc tatgatgaag ggaacttttt ccctggcagt 2580

ggagccccaa atgaggttcg gtttatttct ttagagcccc acttttattt gtatctttca 2640

ggtaattgca ttgcatgatt acccctaatt ttcttgtcct ttgctggtgt tttaaattac 2700

acgagattac tgaattgtcc catgggacca agaaccagtg cagaacaagt gcataaccca 2760

gagcactgtt tgtcagggaa ggttgggctg atttgatgtg ttgtttgatg tttatttcaa 2820

gagctcccat gtgcttgttt tcctctcttc ttgctttctt ccatttgctc tcttctctgc 2880

ccaccgtggt gtgtctttct cttcccaggt tggaacaggc cttggagaag ggtacaatat 2940

aaatattgcc tggacaggtg gccttgatcc tcccatggga gatgttgagt accttgaagc 3000

attcaggacc atcgtgaagc ctgtggccaa agagtttgat ccagacatgg tcttagtatc 3060

tgctggattt gatgcattgg aaggccacac ccctcctcta ggagggtaca aagtgacggc 3120

aaaatgtttt ggtcatttga cgaagcaatt gatgacattg gctgatggac gtgtggtgtt 3180

ggctctagaa ggaggacatg atctcacagc catctgtgat gcatcagaag cctgtgtaaa 3240

tgcccttcta ggaaatgagc tggagccact tgcagaagat attctccacc aaagcccgaa 3300

tatgaatgct gttatttctt tacagaagat cattgaaatt caaagtatgt ctttaaagtt 3360

ctcttaa 3367



<210> 8 

<211> 835 

<212> PRT 

<213> Homo sapiens 



<400> 8 



Met 


His 


Ser 


Met 


He 


Ser 


Ser 


Val 


Asp 


Val 


Lys 


Ser 


Glu 


Val 


Pro 


Val 


1 








5 










10 










15 




Gly 


Leu 


Glu 


Pro 


He 


Ser 


Pro 


Leu 


Asp 


Leu 


Arg Thr 


Asp 


Leu 


Arg 


Met 








20 










25 










30 






Met 


Met 


Pro 


Val 


Val 


Asp 


Pro 


Val 


Val 


Arg 


Glu Lys 


Gin 


Leu 


Gin 


Gin 






35 










40 










45 








Glu 


Leu 
50 


Leu 


Leu 


He 


Gin 


Gin 
55 


Gin 


Gin 


Gin 


He 


Gin 
60 


Lys 


Gin 


Leu 


Leu 


He 


Ala 


Glu 


Phe 


Gin 


Lys 


Gin 


His 


Glu 


Asn 


Leu 


Thr 


Arg 


Gin 


His 


Gin 


65 










70 










75 










80 


Ala 


Gin 


Leu 


Gin 


Glu 


His 


He 


Lys 


Glu 


Leu 


Leu 


Ala 


He 


Lys 


Gin 


Gin 










85 








90 










95 




Gin 


Glu 


Leu 


Leu 
100 


Glu 


Lys 


Glu 


Gin 


Lys 
105 


Leu 


Glu 


Gin 


Gin 


Arg 
110 


Gin 


Glu 


Gin 


Glu 


Val 
115 


Glu 


Arg 


His 


Arg 


Arg 
120 


Glu 


Gin 


Gin 


Leu 


Pro 
125 


Pro 


Leu 


Arg 
Gly


Lys 


Asp 


Arg Gly Arg Glu Arg 




130 




135 




Gin 


Lys 


Leu 


Gin Glu Phe Leu 


Leu 


145 






150 




Pro 


Thr 


Asn 


Gly Lys Asn His 


Ser 








165 




Tyr 


Thr 


Ala 


Ala His His Thr 


Ser 








180 




Ser 


Gly 


Thr 


Ser Pro Ser Tyr 


Lys 






195 




200 


Ala 


Lys 


Asp 


Asp Phe Pro Leu 


Arg 




210 




215 




Ser 


Ser 


Pro 


Gly Ser Gly Pro 


Ser 


225 






230 




Ser 


Val 


Thr 


Glu Asn Glu Thr 


Ser 








245 




Glu 


Gin 


Met 


Val Ser Gln Gln Arg








260 




Asn 


Leu 


Leu 


Ser Leu Tyr Thr Ser 






275 




280 


Gly 


Leu 


Pro 


Ala Val Pro Ser 


Gin 




290 




295 




Glu 


Lys 


Gin 


Lys Cys Glu Thr 


Gin 


305 






310 




Pro 


Gly 


Gin 


Tyr Gly Gly Ser Ile








325 




Val 


Thr 


Leu 


Glu Gly Lys Pro 


Pro 








340 




Gin 


His 


Leu 


Leu Leu Lys Glu 


Gin 






355 




360 


Ala 


Gly 


Gly 


Val Pro Leu His 


Pro 




370 




375 




Arg 


lie 


Ser 


Pro Gly Ile Arg Gly


385 






390 




Pro 


Leu 


Asn 


Arg Thr Gln Ser


Ala 








405 




Gin 


Leu 


Val 


Ile Gln Gln Gln


His 








420 




Gin 


Tyr 


Gin 


Gln Gln Ile His


Met 






435 




440 


Glu 


Gin 


Leu 


Lys Gln Pro Gly Ser




450 




455 




Leu 


Gin 


Gly 


Asp Gln Ala Met


Gin 


465 






470 




Asn 


Ser 


Thr 


Arg Ser Asp Ser 


Ser 








485 




Gin 


Val 


Gly 


Ala Val Lys Val Lys 








500 




Asp 


Ala 


Gin 


Ile Gln Glu Met


Glu 






515 




520 


Gin 


Gin 


Pro 


Phe Leu Glu Pro 


Thr 




530 




535 




Gin 


Ala 


Pro 


Leu Ala Ala Val 


Gly 


545 






550 




Leu 


Val 


Ser 


Arg Thr His Ser 


Ser 








565 




Pro 


Ala 


Met 


Asp Arg Pro Leu 


Gin 








580 




Tyr 


Asp 


Pro 


Leu Met Leu Lys 


His 






595 




600 


Thr 


His 


Pro 


Glu His Ala Gly Arg 




610 




615 





Ala 


Val 


Ala 


Ser Thr Glu Val Lys 








140 




Ser 


Lys 


Ser 


Ala Thr Lys Asp Thr 






155 




160 


Val 


Ser 


Arg 


His Pro Lys 


Leu Trp 




170 






175 


Leu 


Asp 


Gin 


Ser Ser Pro 


Pro Leu 


185 






190 




Tyr 


Thr 


Leu 


Pro Gly Ala 


Gln Asp








205 




Lys


Thr 


Glu 


Ser Ser Val 


Ser Ser 








220 




Ser 


Pro 


Asn 


Asn Gly Pro 


Thr Gly 






235 




240 


Val 


Leu 


Pro 


Pro Thr Pro 


His Ala 




250 






255 


He 


Leu 


He 


His Glu Asp 


Ser Met 


265 






270 




Pro 


Ser 


Leu 


Pro Asn Ile


Thr Leu 








285 




Leu 


Asn 


Ala 


Ser Asn Ser 


Leu Lys 








300 




Thr 


Leu 


Arg 


Gln Gly Val


Pro Leu 






315 




320 


Pro 


Ala 


Ser 


Ser Ser His 


Pro His 




330 






335 


Asn 


Ser 


Ser 


His Gln Ala


Leu Leu 


345 






350 




Met 


Arg 


Gin 


Gln Lys Leu


Leu Val 








365 




Gin 


Ser 


Pro 


Leu Ala Thr 


Lys Glu 








380 




Thr 


His 


Lys 


Leu Pro Arg 


His Arg 






395 




400 


Pro 


Leu 


Pro 


Gln Ser Thr


Leu Ala 




410 






415 


Gin 


Gin 


Phe 


Leu Glu Lys 


Gln Lys


425 






430 




Asn 


Lys 


Leu 


Leu Ser Lys 


Ser Ile








445 




His 


Leu 


Glu 


Glu Ala Glu 


Glu Glu 








460 




Glu 


Asp 


Arg 


Ala Pro Ser 


Ser Gly 






475 




480 


Ala 


Cys 


Val 


Asp Asp Thr Leu Gly 




490 






495 


Glu 


Glu 


Pro 


Val Asp Ser 


Asp Glu 


505 






510 




Ser 


Gly 


Glu 


Gln Ala Ala


Phe Met 








525 




His 


Thr 


Arg 


Ala Leu Ser Val Arg 








540 




Met 


Asp 


Gly 


Leu Glu Lys 


His Arg 






555 




560 


Pro 


Ala 


Ala 


Ser Val Leu 


Pro His 




570 






575 


Pro 


Gly 


Ser 


Ala Thr Gly Ile Ala


585 






590 




Gin 


Cys 


Val 


Cys Gly Asn Ser Thr 








605 




He 


Gin 


Ser 


Ile Trp Ser Arg Leu



620 
Gln


Glu 


Thr 


Gly Leu 


Leu Asn Lys Cys 


Glu 


Arg Ile Gln


Gly Arg Lys 


625 






630 




635 






640 


Ala 


Ser 


Leu 


Glu Glu 


Ile Gln Leu Val


His 


Ser Glu His 


His 


Ser 


Leu 








645 




650 






655 




Leu 


Tyr 


Gly 


Thr Asn 


Pro Leu Asp Gly 


Gin 


Lys Leu Asp 


Pro 


Arg 


He 




660 


665 






670 






Leu 


Leu 


Gly 


Asp Asp 


Ser Gln Lys Phe


Phe 


Ser Ser Leu 


Pro 


Cys 


Gly 






675 




680 




685 








Gly 


Leu 


Gly 


Val Asp 


Ser Asp Thr Ile


Trp 


Asn Glu Leu 


His 


Ser 


Ser 


690 






695 




700 








Gly 


Ala 


Ala 


Arg Met 


Ala Val Gly Cys 


Val 


Ile Glu Leu


Ala 


Ser 


Lys 


705 






710 




715 






720 


Val 


Ala 


Ser 


Gly Glu 


Leu Lys Asn Gly 


Phe 


Ala Val Val 


Arg 


Pro 


Pro 








725 




730 






735 




Gly 


His 


His 


Ala Glu 


Glu Ser Thr Ala 


Met 


Gly Phe Cys 


Phe 


Phe 


Asn 






740 


745 






750 






Ser 


Val 


Ala 


Ile Thr


Ala Lys Tyr Leu 


Arg 


Asp Gln Leu


Asn 


He 


Ser 






755 




760 




765 








Lys 


lie 


Leu 


Ile Val


Asp Leu Asp Val 


His 


His Gly Asn 


Gly Thr Gln


770 






775 




780 








Gin 


Ala 


Phe 


Tyr Ala 


Asp Pro Ser Ile


Leu 


Tyr Ile Ser


Leu 


His 


Arg 


785 






790 




795 






800 


Tyr 


Asp 


Glu 


Gly Asn 


Phe Phe Pro Gly 


Ser 


Gly Ala Pro 


Asn 


Glu Val 




805 




810 






815 




Arg 


Phe 


He 


Ser Leu 


Glu Pro His Phe 


Tyr 


Leu Tyr Leu 


Ser Gly Asn 






820 


825 






830 






Cys 


lie 


Ala 




















835 

















<210> 9 

<211> 1791 

<212> DNA 

<213> Homo sapiens 

<400> 9 

ggggaagaga ggcacagaca cagataggag aagggcaccg gctggagcca cttgcaggac 60
tgagggtttt tgcaacaaaa ccctagcagc ctgaagaact ctaagccaga tggggtggct 120
ggacgagagc agctcttggc tcagcaaaga atgcacagta tgatcagctc agtggatgtg 180
aagtcagaag ttcctgtggg cctggagccc atctcacctt tagacctaag gacagacctc 240
aggatgatga tgcccgtggt ggaccctgtt gtccgtgaga agcaattgca gcaggaatta 300
cttcttatcc agcagcagca acaaatccag aagcagcttc tgatagcaga gtttcagaaa 360
cagcatgaga acttgacacg gcagcaccag gctcagcttc aggagcatat caaggaactt 420
ctagccataa aacagcaaca agaactccta gaaaaggagc agaaactgga gcagcagagg 480
caagaacagg aagtagagag gcatcgcaga gaacagcagc ttcctcctct cagaggcaaa 540
gatagaggac gagaaagggc agtggcaagt acagaagtaa agcagaagct tcaagagttc 600
ctactgagta aatcagcaac gaaagacact ccaactaatg gaaaaaatca ttccgtgagc 660
cgccatccca agctctggta cacggctgcc caccacacat cattggatca aagctctcca 720
ccccttagtg gaacatctcc atcctacaag tacacattac caggagcaca agatgcaaag 780
gatgatttcc cccttcgaaa aactgaatcc tcagtcagta gcagttctcc aggctctggt 840
cccagttcac caaacaatgg gccaactgga agtgttactg aaaatgagac ttcggttttg 900
ccccctaccc ctcatgccga gcaaatggtt tcacagcaac gcattctaat tcatgaagat 960
tccatgaacc tgctaagtct ttatacctct ccttctttgc ccaacattac cttggggctt 1020
cccgcagtgc catcccagct caatgcttcg aattcactca aagaaaagca gaagtgtgag 1080
acgcagacgc ttaggcaagg tgttcctctg cctgggcagt atggaggcag catcccggca 1140
tcttccagcc accctcatgt tactttagag ggaaagccac ccaacagcag ccaccaggct 1200
ctcctgcagc atttattatt gaaagaacaa atgcgacagc aaaagcttct tgtagctggt 1260
ggagttccct tacatcctca gtctcccttg gcaacaaaag agagaatttc acctggcatt 1320
agaggtaccc acaaattgcc ccgtcacaga cccctgaacc gaacccagtc tgcacctttg 1380
cctcagagca cgttggctca gctggtcatt caacagcaac accagcaatt cttggagaag 1440
cagaagcaat accagcagca gatccacatg aacaaactgc tttcgaaatc tattgaacaa 1500
ctgaagcaac caggcagtca ccttgaggaa gcagaggaag agcttcaggg ggaccaggcg 1560
atgcaggaag acagagcgcc ctctagtggc aacagcacta ggagcgacag cagtgcttgt 1620
gtggatgaca cactgggaca agttggggct gtgaaggtca aggaggaacc agtggacagt 1680 
gatgaagatg ctcagatcca ggaaatggaa tctggggagc aggctgcttt tatgcaacag 1740 
gtaataggca aagatttagc tccaggattt gtaattaaag tcattatctg a 1791 

<210> 10 
<211> 546 
<212> PRT 

<213> Homo sapiens 



<400> 10 



Met 


His 


Ser 


Met 


He 


Ser 


Ser 


Val 


Asp 


Val 


Lys 


Ser 


Glu 


Val 


Pro 


Val 


1 








5 










10 










15 




Gly 


Leu 


Glu 


Pro 


He 


Ser 


Pro 


Leu 


Asp


Leu 


Arg Thr 


Asp 


Leu 


Arg 


Met 








20 










25 










30 






Met 


Met 


Pro 


Val 


Val 


Asp 


Pro 


Val 


Val 


Arg 


Glu 


Lys 


Gin 


Leu 


Gin 


Gin 






35 










40 










45 








Glu 


Leu


Leu 


Leu 


He 


Gin 


Gin 


Gin 


Gin 


Gin 


He 


Gin 


Lys 


Gin 


Leu 


Leu 




50 










55 










60 










lie 


Ala 


Glu 


Phe 


Gin 


Lys 


Gin 


His 


Glu 


Asn 


Leu 


Thr 


Arg 


Gin 


His 


Gin 


65 










70 










75 










80 


Ala 


Gin 


Leu 


Gin 


Glu 


His 


He 


Lys 


Glu 


Leu 


Leu 


Ala 


He 


Lys 


Gin 


Gin 










85 










90 










95 




Gin 


Glu 


Leu 


Leu 


Glu 


Lys 


Glu 


Gin 


Lys 


Leu 


Glu 


Gin 


Gin 


Arg 


Gin 


Glu 








100 










105 










110 






Gin 


Glu 


Val 


Glu 


Arg 


His 


Arg 


Arg 


Glu 


Gin 


Gin 


Leu 


Pro 


Pro 


Leu Arg 






115 










120 










125 








Gly 


Lys 


Asp 


Arg 


Gly 


Arg 


Glu 


Arg 


Ala 


Val 


Ala 


Ser 


Thr 


Glu 


Val 


Lys 




130 










135 










140 










Gin 


Lys 


Leu 


Gin 


Glu 


Phe 


Leu 


Leu 


Ser 


Lys 


Ser 


Ala 


Thr 


Lys 


Asp 


Thr 


145 










150 










155 










160 


Pro 


Thr 


Asn 


Gly 


Lys 


Asn 


His 


Ser 


Val 


Ser 


Arg His 


Pro 


Lys 


Leu 


Trp 










165 










170 










175 




Tyr 


Thr 


Ala 


Ala 


His 


His 


Thr 


Ser 


Leu 


Asp 


Gin 


Ser 


Ser 


Pro 


Pro 


Leu 








180 










185 










190 






Ser 


Gly 


Thr 


Ser 


Pro 


Ser 


Tyr 


Lys 


Tyr 


Thr 


Leu Pro Gly 


Ala 


Gln Asp






195 










200 










205 








Ala 


Lys 


Asp 


Asp 


Phe 


Pro 


Leu 


Arg 


Lys 


Thr 


Glu 


Ser 


Ser 


Val 


Ser 


Ser 




210 










215 










220 










Ser 


Ser 


Pro 


Gly 


Ser 


Gly 


Pro 


Ser 


Ser 


Pro 


Asn Asn Gly 


Pro 


Thr Gly 


225 










230 










235 










240 


Ser 


Val 


Thr 


Glu 


Asn 


Glu 


Thr 


Ser 


Val 


Leu 


Pro 


Pro 


Thr 


Pro 


His 


Ala 










245 










250 










255 




Glu 


Gin 


Met 


Val 


Ser 


Gin 


Gin 


Arg 


He 


Leu 


He 


His 


Glu 


Asp 


Ser 


Met 








260 










265 










270 






Asn 


Leu 


Leu 


Ser 


Leu 


Tyr 


Thr 


Ser 


Pro 


Ser 


Leu 


Pro 


Asn 


He 


Thr 


Leu 






275 










280 










285 








Gly 


Leu 


Pro 


Ala 


Val 


Pro 


Ser 


Gin 


Leu 


Asn 


Ala 


Ser 


Asn 


Ser 


Leu 


Lys 




290 










295 










300 








Glu 


Lys 


Gin 


Lys 


Cys 


Glu 


Thr 


Gin 


Thr 


Leu 


Arg Gln Gly


Val 


Pro 


Leu 


305 










310 










315 










320 


Pro 


Gly 


Gin 


Tyr 


Gly 


Gly 


Ser 


He 


Pro 


Ala 


Ser 


Ser 


Ser 


His 


Pro 


His 










325 










330 










335 




Val 


Thr 


Leu 


Glu 


Gly 


Lys 


Pro 


Pro 


Asn 


Ser 


Ser 


His 


Gin 


Ala 


Leu 


Leu 








340 










345 










350 






Gin 


His 


Leu 


Leu 


Leu 


Lys 


Glu 


Gin 


Met 


Arg 


Gln Gln Lys


Leu 


Leu 


Val 






355 










360 










365 








Ala 


Gly 


Gly Val 


Pro 


Leu 


His 


Pro 


Gin 


Ser 


Pro 


Leu Ala 


Thr 


Lys 


Glu 




370 










375 










380 










Arg 


He 


Ser 


Pro 


Gly 


He 


Arg 


Gly 


Thr 


His 


Lys 


Leu 


Pro 


Arg 


His 


Arg 


385 










390 










395 










400 


Pro 


Leu 


Asn Arg 


Thr 


Gin 


Ser 


Ala 


Pro 


Leu 


Pro 


Gin 


Ser 


Thr 


Leu 


Ala 
405








Gin 


Leu 


Val Ile Gln


Gin 


Gin 


His 






420 








Gin 


Tyr 


Gln Gln Gln


He 


His 


Met 






435 






440 


Glu 


Gin 


Leu Lys Gln


Pro Gly 


Ser 




450 






455 




Leu 


Gin 


Gly Asp Gln


Ala 


Met 


Gin 


465 






470 






Asn 


Ser 


Thr Arg Ser 


Asp 


Ser 


Ser 






485 








Gin 


Val 


Gly Ala Val 


Lys Val 


Lys 






500 








Asp 


Ala 


Gln Ile Gln


Glu 


Met 


Glu 






515 






520 


Gin 


Gin 


Val Ile Gly


Lys 


Asp 


Leu 




530 






535 




He 


He 










545 













<210> 11 
<211> 590 
<212> PRT 

<213> Homo sapiens 



<400> 11 



Met 


His 


Ser 


Met 


He 


Ser 


Ser 


Val 


1 








5 








Gly 


Leu 


Glu 


Pro 


He 


Ser 


Pro 


Leu 








20 










Met 


Met 


Pro 


Val 


Val 


Asp 


Pro 


Val 






35 










40 


Glu 


Leu 


Leu 


Leu 


He 


Gin 


Gin 


Gin 




50 










55 




lie 


Ala 


Glu 


Phe 


Gin 


Lys 


Gin 


His 


65 










70 






Ala 


Gin 


Leu 


Gin 


Glu 


His 


He 


Lys 










85 








Gin 


Glu 


Leu 


Leu 


Glu 


Lys 


Glu 


Gin 








100 










Gin 


Glu 


Val 


Glu 


Arg 


His 


Arg 


Arg 






115 










120 


Gly 


Lys 


Asp 


Arg 


Gly 


Arg 


Glu 


Arg 




130 










135 




Gin 


Lys 


Leu 


Gin 


Glu 


Phe 


Leu 


Leu 


145 










150 






Pro 


Thr 


Asn 


Gly 


Lys 


Asn 


His 


Ser 










165 








Tyr 


Thr 


Ala 


Ala 


His 


His 


Thr 


Ser 








180 










Ser 


Gly 


Thr 


Ser 


Pro 


Ser 


Tyr 


Lys 






195 










200 


Ala 


Lys 


Asp 


Asp 


Phe 


Pro 


Leu 


Arg 




210 










215 




Lys 


Val 


Arg 


Ser 


Arg 


Leu Lys 


Gin 


225 










230 






Pro 


Leu 


Leu 


Arg 


Arg 


Lys Asp 


Gly 










245 








Arg 


Met 


Phe 


Glu 


Val 


Thr Glu 


Ser 








260 










Ser 


Gly 


Pro 


Ser 


Ser 


Pro 


Asn 


Asn 
410




415 




Gin 


Gin 


Phe Leu Glu Lys 


Gin 


Lys 


425 




430 






Asn 


Lys 


Leu Leu Ser Lys 


Ser 


He 






445 






His 


Leu 


Glu Glu Ala Glu 


Glu 


Glu 






460 






Glu 


Asp 


Arg Ala Pro Ser 


Ser 


Gly 






475 




480 


Ala 


Cys 


Val Asp Asp Thr 


Leu 


Gly 




490 




495 




Glu 


Glu 


Pro Val Asp Ser 


Asp 


Glu 


505 




510 






Ser 


Gly 


Glu Gln Ala Ala


Phe 


Met 






525 






Ala 


Pro 


Gly Phe Val Ile


Lys 


Val 






540 






Asp 


Val 


Lys Ser Glu Val 


Pro 


Val 




10 




15 




Asp 


Leu 


Arg Thr Asp Leu 


Arg 


Met 


25 




30 






Val 


Arg 


Glu Lys Gln Leu


Gin 


Gin 






45 






Gin 


Gin 


Ile Gln Lys Gln


Leu 


Leu 






60 






Glu 


Asn 


Leu Thr Arg Gln


His 


Gin 






75 




80 


Glu 


Leu 


Leu Ala Ile Lys


Gin 


Gin 




90 




95 




Lys 


Leu 


Glu Gln Gln Arg


Gin 


Glu 


105 




110 






Glu 


Gin 


Gln Leu Pro Pro


Leu 


Arg 






125 






Ala 


Val 


Ala Ser Thr Glu 


Val 


Lys 






140 






Ser 


Lys 


Ser Ala Thr Lys 


Asp 


Thr 






155 




160 


Val 


Ser 


Arg His Pro Lys 


Leu 


Trp 




170 




175 




Leu 


Asp 


Gln Ser Ser Pro


Pro 


Leu 


185




190






Tyr 


Thr 


Leu Pro Gly Ala 


Gin 


Asp 






205 






Lys 


Thr 


Ala Ser Glu Pro 


Asn 


Leu 




220 






Lys 


Val 


Ala Glu Arg Arg 


Ser 


Ser 






235 




240 


Asn 


Val 


Val Thr Ser Phe 


Lys 


Lys 




250 




255 




Ser 


Val 


Ser Ser Ser Ser 


Pro 


Gly 


265 




270 






Gly 


Pro 


Thr Gly Ser Val 


Thr 


Glu 
275










280 


Asn 


Glu 
290 


Thr 


Ser 


Val 


Leu 


Pro 
295 


Pro 


Ser 


Gin 


Gin 


Arg 


He 


Leu 


He 


His 


305 










310 






Leu 


Tyr 


Thr 


Ser 


Pro 
325 


Ser 


Leu 


Pro 


Val 


Pro 


Ser 


Gin 
340 


Leu 


Asn 


Ala 


Ser 


Cys 


Glu 


Thr 


Gin 


Thr 


Leu Arg Gln






355 










360 


Gly 


Gly 
370 


Ser 


He 


Pro 


Ala 


Ser 
375 


Ser 


Gly 


Lys 


Pro 


Pro 


Asn 


Ser 


Ser 


His 


385 










390 






Leu 


Lys 


Glu 


Gin 


Met 
405 


Arg 


Gin 


Gin 


Pro 


Leu 


His 


Pro 
420 


Gin 


Ser 


Pro 


Leu 


Gly 


He 


Arg 


Gly Thr 


His 


Lys 


Leu 






435 










440 


Thr 


Gin 
450 


Ser 


Ala 


Pro 


Leu 


Pro 
455 


Gin 


Gin 


Gin 


Gin 


His 


Gin 


Gin 


Phe 


Leu 


465 










470 






Gin 


He 


His 


Met 


Asn 
485 


Lys 


Leu 


Leu 


Gin 


Pro 


Gly 


Ser 
500 


His 


Leu 


Glu 


Glu 


Gin 


Ala 


Met 


Gin 


Glu 


Asp Arg Ala 






515 










520 


Ser 


Asp 


Ser 


Ser 


Ala 


Cys Val 


Asp 




530 










535 




Val 


Lys 


Val 


Lys 


Glu 


Glu 


Pro 


Val 


545 










550 






Gin 


Glu 


Met 


Glu 


Ser 


Gly Glu Gln










565 








Gly 


Lys 


Asp 


Leu Ala 


Pro Gly Phe 








580 











<210> 12 
<211> 1084 
<212> PRT 

<213> Homo sapiens 



<400> 12 



Met 


Ser 


Ser 


Gin 


Ser 


His Pro Asp 


1 








5 




Val 


Glu 


Leu 


Leu 


Asn 


Pro Ala Arg 








20 






Asp 


Val 


Ala 


Thr 


Ala 


Leu Pro Leu 






35 






40 


Met 


Asp 


Leu 


Arg 


Leu 


Asp His Gln




50 








55 


Ala 


Leu 


Arg 


Glu 


Gin 


Gln Leu Gln


65 










70 


Lys 


Gin 


Gin 


He 


Gin 


Arg Gln Ile










85 




His 


Glu 


Gin 


Leu 


Ser 


Arg Gln His








100 






Lys 


Gin 


Gin 


Gin 


Glu 


Met Leu Ala 



285 



Thr 


Pro 


His Ala Glu Gln


Met 


Val 






300 






Glu 


Asp 


Ser Met Asn Leu 


Leu 


Ser 






315 




320 


Asn 


He 


Thr Leu Gly Leu 


Pro 


Ala 




330 




335 




Asn 


Ser 


Leu Lys Glu Lys 


Gin 


Lys 


345 




350 






Gly 


Val 


Pro Leu Pro Gly 


Gin 


Tyr 






365 






Ser 


His 


Pro His Val Thr 


Leu 


Glu 






380 






Gin 


Ala 


Leu Leu Gln His


Leu 


Leu 






395 




400 


Lys 


Leu 


Leu Val Ala Gly 


Gly Val 




410 




415 




Ala 


Thr 


Lys Glu Arg Ile


Ser 


Pro 


425 




430 






Pro 


Arg 


His Arg Pro Leu 


Asn 


Arg 






445 






Ser 


Thr 


Leu Ala Gln Leu


Val 


He 






460 






Glu 


Lys 


Gln Lys Gln Tyr


Gin 


Gin 






475 




480 


Ser 


Lys 


Ser Ile Glu Gln


Leu 


Lys 




490 




495 




Ala 


Glu 


Glu Glu Leu Gln


Gly Asp 


505 




510 






Pro 


Ser 


Ser Gly Asn Ser 


Thr 


Arg 






525 






Asp 


Thr 


Leu Gly Gln Val


Gly Ala 






540 






Asp 


Ser 


Asp Glu Asp Ala 


Gin 


He 






555 




560 


Ala 


Ala 


Phe Met Gln Gln


Val 


He 




570 




575 




Val 


He 


Lys Val Ile Ile






585 




590 






Gly 


Leu 


Ser Gly Arg Asp 


Gin 


Pro 




10 




15 




Val 


Asn 


His Met Pro Ser 


Thr 


Val 


25 




30 






Gin 


Val 


Ala Pro Ser Ala 


Val 


Pro 






45 






Phe 


Ser 


Leu Pro Val Ala 


Glu 


Pro 






60 






Gin 


Glu 


Leu Leu Ala Leu 


Lys 


Gin 






75 




80 


Leu 


He 


Ala Glu Phe Gln


Arg 


Gin 




90 




95 




Glu 


Ala 


Gln Leu His Glu


His 


He 


105 




110 






Met 


Lys 


His Gln Gln Glu


Leu 


Leu 
115



Glu 


His 


Gin 


Arg 


Lys 


Leu 




130 










Lys 


Gin 


His 


Arg 


Glu 


Gin 


145 










150 


Gly 


Lys 


Glu 


Ser 


Ala 


Val 










165 




Glu 


Phe


Val 


Leu 


Asn 


Lys 








180 






His 


Cys 


He 


Ser 


Ser Asp 






195 








Ser 


Ser 


Leu 


Asp 


Gin 


Ser 




210 










Tyr 


Asn 


His 


Pro 


Val 


Leu 


225 










230 


Leu 


Arg 


Lys 


Thr 


Ala 


Ser 










245 




Lys 


Gin 


Lys 


Val 


Ala 


Glu 








260 






Asp 


Gly 


Pro 


Val 


Val 


Thr 






275 








Asp 


Ser 


Ala 


Cys 


Ser 


Ser 




290 










Asn 


Ser 


Ser 


Gly 


Ser 


Val 


305 










310 


Pro 


Ser 


He 


Pro 


Ala 


Glu 










325 




Glu 


Gly 


Ser 


Ala 


Ala 


Pro 








340 






Asn 


lie 


Thr 


Leu 


Gly Leu 






355 








Gly 


Gin 


Gin 


Asp 


Thr 


Glu 




370 










Leu 


Ser 


Leu 


Phe 


Pro 


Gly 


385 










390 


Pro 


Leu 


Glu * Arg 


Asp 


Gly 










405 




Met 


Val 


Leu 


Leu 


Glu 


Gin 








420 






Leu 


Gly 


Ala 


Leu 


Pro 


Leu 






435 








Val 


Ser 


Pro 


Ser 


He 


His 




450 










Thr 


Gin 


Ser 


Ala 


Pro 


Leu 


465 










470 


Val 


He 


Gin 


Gin 


Gin 


His 










485 




Phe 


Gin 


Gin 


Gin 


Gin 


Leu 








500 






Glu 


Pro 


Ala 


Arg 


Gin 


Pro 






515 








Leu 


Arg 


Glu 


His 


Gin 


Ala 




530 










Pro 


Gly 


Gin 


Lys 


Glu 


Ala 


545 










550 


Glu 


Pro 


He 


Glu 


Ser 


Asp 










565 




Glu 


Pro 


Gly 


Gin 


Arg 


Gin 








580 






Gin 


Ala 


Leu 


Leu 


Leu 


Glu 






595 








Gin 


Ala 


Ser 


Met 


Glu Ala 
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120 



Glu Arg 


His 


Arg ' Gin 


Glu 


135 










140 


Lys 


Leu 


Gin 


Gin 


Leu 


Lys 










155 




Ala 


Ser 


Thr 


Glu 


Val 


Lys 








170 






Lys 


Lys 


Ala 


Leu 


Ala 


His 






185 








Pro 


Arg 


Tyr 


Trp Tyr 


Gly 




200 










Ser 


Pro 


Pro 


Gin 


Ser 


Gly 


215 










220 


Gly Met 


Tyr 


Asp Ala 


Lys 










235 




Glu 


Pro 


Asn 


Leu Lys 


Leu 








250 






Arg 


Arg 


Ser 


Ser 


Pro 


Leu 






265 








Ala 


Leu 


Lys 


Lys 


Arg 


Pro 




280 










Ala 


Pro 


Gly 


Ser Gly 


Pro 


295 










300 


Ser 


Ala 


Glu 


Asn Gly 


He 










315 




Thr 


Ser 


Leu 


Ala 


His 


Arg 








330 






Leu 


Pro 


Leu 


Tyr 


Thr 


Ser 






345 








Pro 


Ala 


Thr 


Gly 


Pro 


Ser 




360 










Arg Leu 


Thr 


Leu 


Pro 


Ala 


375 










380 


Thr 


His 


Leu 


Thr 


Pro 


Tyr 










395 




Gly Ala 


Ala 


His 


Ser 


Pro 








410 






Pro 


Pro 


Ala 


Gin 


Ala 


Pro 






425 








His 


Ala 


Gin 


Ser 


Leu 


Val 




440 










Lys 


Leu 


Arg 


Gin 


His 


Arg 


455 










460 


Pro 


Gin 


Asn 


Ala 


Gin 


Ala 










475 




Gin 


Gin 


Phe 


Leu 


Glu 


Lys 








490 






Gin 


Met 


Asn 


Lys 


He 


He 






505 








Glu 


Ser 


His 


Pro 


Glu 


Glu 




520 










Leu 


Leu 


Asp 


Glu 


Pro 


Tyr 


535 










540 


His 


Ala 


Gin 


Ala Gly 


Val 










555 




Glu 


Glu 


Glu 


Ala 


Glu 


Pro 








570 






Pro 


Ser 


Glu 


Gin 


Glu 


Leu 






585 








Gin 


Gin 


Arg 


He 


His 


Gin 




600 










Ala Gly 


He 


Pro Val 


Ser 



PCT7US02/19051 



125 








Gin 


Glu 


Leu 


Glu 


Asn 


Lys 


Glu 


Lys 








160 


Met 


Lys 


Leu 


Gin 






175 




Arg 


Asn 


Leu 


Asn 




190 






Lys 


Thr 


Gin 


His 










Val 


Ser 


Thr 


Ser 


Asp 


Asp 


Phe 


Pro 








240 


Arg 


Ser 


Arg 


Leu 






255 




Leu 


Arg 


Arg 


Lys 




270 






Leu 


Asp 


Val 


Thr 


285 








Ser 


Ser 


Pro 


Asn 


Ala 


Pro 


Ala 


Val 








320 


Leu 


Val 


Ala 


Arg 






335 




Pro 


Ser 


Leu 


Pro 




350 






Ala 


Gly 


Thr 


Ala 


O vj *J 








Leu 


Gin 


Gin 


Arg 


Leu 


Ser 


Thr 


Ser 








400 


Leu 


Leu 


Gin 


His 






415 




Leu 


Val 


Thr 


Gly 




430 






Gly 


Ala 


Asp 


Arg 


445 








Pro 


Leu 


Gly 


Arg 


Leu 


Gin 


His 


Leu 








480 


His 


Lys 


Gin 


Gin 






495 




Pro 


Lys 


Pro 


Ser 




510 






Thr 


Glu 


Glu 


Glu 


525 








Leu 


Asp 


Arg 


Leu 


Gin 


Val 


Lys 


Gin 








560 


Pro 


Arg 


Glu 


Val 






575 




Leu 


Phe 


Arg 


Gin 




590 






Leu 


Arg 


Asn 


Tyr 


605 








Phe 


Gly 


Gly 


His 








    610                 615                 620
Arg Pro Leu Ser Arg Ala Gln Ser Ser Pro Ala Ser Ala Thr Phe Pro
625                 630                 635                 640
Val Ser Val Gln Glu Pro Pro Thr Lys Pro Arg Phe Thr Thr Gly Leu
                645                 650                 655
Val Tyr Asp Thr Leu Met Leu Lys His Gln Cys Thr Cys Gly Ser Ser
            660                 665                 670
Ser Ser His Pro Glu His Ala Gly Arg Ile Gln Ser Ile Trp Ser Arg
        675                 680                 685
Leu Gln Glu Thr Gly Leu Arg Gly Lys Cys Glu Cys Ile Arg Gly Arg
    690                 695                 700
Lys Ala Thr Leu Glu Glu Leu Gln Thr Val His Ser Glu Ala His Thr
705                 710                 715                 720
Leu Leu Tyr Gly Thr Asn Pro Leu Asn Arg Gln Lys Leu Asp Ser Lys
                725                 730                 735
Lys Leu Leu Gly Ser Leu Ala Ser Val Phe Val Arg Leu Pro Cys Gly
            740                 745                 750
Gly Val Gly Val Asp Ser Asp Thr Ile Trp Asn Glu Val His Ser Ala
        755                 760                 765
Gly Ala Ala Arg Leu Ala Val Gly Cys Val Val Glu Leu Val Phe Lys
    770                 775                 780
Val Ala Thr Gly Glu Leu Lys Asn Gly Phe Ala Val Val Arg Pro Pro
785                 790                 795                 800
Gly His His Ala Glu Glu Ser Thr Pro Met Gly Phe Cys Tyr Phe Asn
                805                 810                 815
Ser Val Ala Val Ala Ala Lys Leu Leu Gln Gln Arg Leu Ser Val Ser
            820                 825                 830
Lys Ile Leu Ile Val Asp Trp Asp Val His His Gly Asn Gly Thr Gln
        835                 840                 845
Gln Ala Phe Tyr Ser Asp Pro Ser Val Leu Tyr Met Ser Leu His Arg
    850                 855                 860
Tyr Asp Asp Gly Asn Phe Phe Pro Gly Ser Gly Ala Pro Asp Glu Val
865                 870                 875                 880
Gly Thr Gly Pro Gly Val Gly Phe Asn Val Asn Met Ala Phe Thr Gly
                885                 890                 895
Gly Leu Asp Pro Pro Met Gly Asp Ala Glu Tyr Leu Ala Ala Phe Arg
            900                 905                 910
Thr Val Val Met Pro Ile Ala Ser Glu Phe Ala Pro Asp Val Val Leu
        915                 920                 925
Val Ser Ser Gly Phe Asp Ala Val Glu Gly His Pro Thr Pro Leu Gly
    930                 935                 940
Gly Tyr Asn Leu Ser Ala Arg Cys Phe Gly Tyr Leu Thr Lys Gln Leu
945                 950                 955                 960
Met Gly Leu Ala Gly Gly Arg Ile Val Leu Ala Leu Glu Gly Gly His
                965                 970                 975
Asp Leu Thr Ala Ile Cys Asp Ala Ser Glu Ala Cys Val Ser Ala Leu
            980                 985                 990
Leu Gly Asn Glu Leu Asp Pro Leu Pro Glu Lys Val Leu Gln Gln Arg
        995                 1000                1005
Pro Asn Ala Asn Ala Val Arg Ser Met Glu Lys Val Met Glu Ile His
    1010                1015                1020
Ser Lys Tyr Trp Arg Cys Leu Gln Arg Thr Thr Ser Thr Ala Gly Arg
1025                1030                1035                1040
Ser Leu Ile Glu Ala Gln Thr Cys Glu Asn Glu Glu Ala Glu Thr Val
                1045                1050                1055
Thr Ala Met Ala Ser Leu Ser Val Gly Val Lys Pro Ala Glu Lys Arg
            1060                1065                1070
Pro Asp Glu Glu Pro Met Glu Glu Glu Pro Pro Leu
        1075                1080

<210> 13
<211> 3550
<212> DNA 

<213> Homo sapiens 
<400> 13 

ggggaagaga ggcacagaca cagataggag aagggcaccg gctggagcca cttgcaggac 60
tgagggtttt tgcaacaaaa ccctagcagc ctgaagaact ctaagccaga tggggtggct 120
ggacgagagc agctcttggc tcagcaaaga atgcacagta tgatcagctc agtggatgtg 180
aagtcagaag ttcctgtggg cctggagccc atctcacctt tagacctaag gacagacctc 240
aggatgatga tgcccgtggt ggaccctgtt gtccgtgaga agcaattgca gcaggaatta 300
cttcttatcc agcagcagca acaaatccag aagcagcttc tgatagcaga gtttcagaaa 360
cagcatgaga acttgacacg gcagcaccag gctcagcttc aggagcatat caaggaactt 420
ctagccataa aacagcaaca agaactccta gaaaaggagc agaaactgga gcagcagagg 480
caagaacagg aagtagagag gcatcgcaga gaacagcagc ttcctcctct cagaggcaaa 540
gatagaggac gagaaagggc agtggcaagt acagaagtaa agcagaagct tcaagagttc 600
ctactgagta aatcagcaac gaaagacact ccaactaatg gaaaaaatca ttccgtgagc 660
cgccatccca agctctggta cacggctgcc caccacacat cattggatca aagctctcca 720
ccccttagtg gaacatctcc atcctacaag tacacattac caggagcaca agatgcaaag 780
gatgatttcc cccttcgaaa aactgcctct gagcccaact tgaaggtgcg gtccaggtta 840
aaacagaaag tggcagagag gagaagcagc cccttactca ggcggaagga tggaaatgtt 900
gtcacttcat tcaagaagcg aatgtttgag gtgacagaat cctcagtcag tagcagttct 960
ccaggctctg gtcccagttc accaaacaat gggccaactg gaagtgttac tgaaaatgag 1020
acttcggttt tgccccctac ccctcatgcc gagcaaatgg tttcacagca acgcattcta 1080
attcatgaag attccatgaa cctgctaagt ctttatacct ctccttcttt gcccaacatt 1140
accttggggc ttcccgcagt gccatcccag ctcaatgctt cgaattcact caaagaaaag 1200
cagaagtgtg agacgcagac gcttaggcaa ggtgttcctc tgcctgggca gtatggaggc 1260
agcatcccgg catcttccag ccaccctcat gttactttag agggaaagcc acccaacagc 1320
agccaccagg ctctcctgca gcatttatta ttgaaagaac aaatgcgaca gcaaaagctt 1380
cttgtagctg gtggagttcc cttacatcct cagtctccct tggcaacaaa agagagaatt 1440
tcacctggca ttagaggtac ccacaaattg ccccgtcaca gacccctgaa ccgaacccag 1500
tctgcacctt tgcctcagag cacgttggct cagctggtca ttcaacagca acaccagcaa 1560
ttcttggaga agcagaagca ataccagcag cagatccaca tgaacaaact gctttcgaaa 1620
tctattgaac aactgaagca accaggcagt caccttgagg aagcagagga agagcttcag 1680
ggggaccagg cgatgcagga agacagagcg ccctctagtg gcaacagcac taggagcgac 1740
agcagtgctt gtgtggatga cacactggga caagttgggg ctgtgaaggt caaggaggaa 1800
ccagtggaca gtgatgaaga tgctcagatc caggaaatgg aatctgggga gcaggctgct 1860
tttatgcaac aggtaatagg caaagattta gctccaggat ttgtaattaa agtcattatc 1920
tgacctttcc tggaacccac gcacacacgt gcgctctctg tgcgccaagc tccgctggct 1980
gcggttggca tggatggatt agagaaacac cgtctcgtct ccaggactca ctcttcccct 2040
gctgcctctg ttttacctca cccagcaatg gaccgccccc tccagcctgg ctctgcaact 2100
ggaattgcct atgacccctt gatgctgaaa caccagtgcg tttgtggcaa ttccaccacc 2160
caccctgagc atgctggacg aatacagagt atctggtcac gactgcaaga aactgggctg 2220
ctaaataaat gtgagcgaat tcaaggtcga aaagccagcc tggaggaaat acagcttgtt 2280
cattctgaac atcactcact gttgtatggc accaaccccc tggacggaca gaagctggac 2340
cccaggatac tcctaggtga tgactctcaa aagttttttt cctcattacc ttgtggtgga 2400
cttggggtgg acagtgacac catttggaat gagctacact cgtccggtgc tgcacgcatg 2460
gctgttggct gtgtcatcga gctggcttcc aaagtggcct caggagagct gaagaatggg 2520
tttgctgttg tgaggccccc tggccatcac gctgaagaat ccacagccat ggggttctgc 2580
ttttttaatt cagttgcaat taccgccaaa tacttgagag accaactaaa tataagcaag 2640
atattgattg tagatctgga tgttcaccat ggaaacggta cccagcaggc cttttatgct 2700
gaccccagca tcctgtacat ttcactccat cgctatgatg aagggaactt tttccctggc 2760
agtggagccc caaatgaggt tcggtttatt tctttagagc cccactttta tttgtatctt 2820
tcaggtaatt gcattgcatg attaccccta attttcttgt cctttgctgg tgttttaaat 2880
tacacgagat tactgaattg tcccatggga ccaagaacca gtgcagaaca agtgcataac 2940
ccagagcact gtttgtcagg gaaggttggg ctgatttgat gtgttgtttg atgtttattt 3000
caagagctcc catgtgcttg ttttcctctc ttcttgcttt cttccatttg ctctcttctc 3060
tgcccaccgt ggtgtgtctt tctcttccca ggttggaaca ggccttggag aagggtacaa 3120
tataaatatt gcctggacag gtggccttga tcctcccatg ggagatgttg agtaccttga 3180
agcattcagg accatcgtga agcctgtggc caaagagttt gatccagaca tggtcttagt 3240
atctgctgga tttgatgcat tggaaggcca cacccctcct ctaggagggt acaaagtgac 3300
ggcaaaatgt tttggtcatt tgacgaagca attgatgaca ttggctgatg gacgtgtggt 3360
gttggctcta gaaggaggac atgatctcac agccatctgt gatgcatcag aagcctgtgt 3420
aaatgccctt ctaggaaatg agctggagcc acttgcagaa gatattctcc accaaagccc 3480
gaatatgaat gctgttattt ctttacagaa gatcattgaa attcaaagta tgtctttaaa 3540
gttctcttaa 3550

<210> 14

<211> 7699

<212> DNA 

<213> Homo sapiens 

<400> 14 

cccattcgcc attcaggctg cgcaactgtt gggaagggcg atcggtgcgg gcctcttcgc 60 
tattacgcca gctggcgaaa gggggatgtg ctgcaaggcg attaagttgg gtaacgccca 120 
gggttttccc agtcacgacg ttgtaaaacg acggccagtg ccaagctgat ctaatcaata 180 
ttggccatta gccatattat tcattggtta tatagcataa atcaatattg gctattggcc 240 
attgcatacg ttgtatccat atcataatat gtacatttat attggctcat gtccaacatt 300 
accgccatgt tgacattgat tattgactag ttattaatag taatcaatta cggggtcatt 360 
agttcatagc ccatatatgg agttccgcgt tacataactt acggtaaatg gcccgcctgg 420 
cgaccgccca gcgacccccg cccgttgacg tcaatagtga cgtatgttcc catagtaacg 480 
ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac tgcccacttg 540 
gcagtacatc aagtgtatca tatgccaagt ccgcccccta ttgacgtcaa tgacggtaaa 600 
tggcccgcct agcattatgc ccagtacatg accttacggg agtttcctac ttggcagtac 660 
atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta caccaatggg 720 
cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga cgtcaatggg 780 
agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaataa ccccgccccg 840
ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag agctcgttta 900 
gtgaaccgtc agaattcaag cttgcggccg cagatctatc gatctgcagg atatcaccat 960 
gcacagtatg atcagctcag tggatgtgaa gtcagaagtt cctgtgggcc tggagcccat 1020 
ctcaccttta gacctaagga cagacctcag gatgatgatg cccgtggtgg accctgttgt 1080 
ccgtgagaag caattgcagc aggaattact tcttatccag cagcagcaac aaatccagaa 1140 
gcagcttctg atagcagagt ttcagaaaca gcatgagaac ttgacacggc agcaccaggc 1200
tcagcttcag gagcatatca aggaacttct agccataaaa cagcaacaag aactcctaga 1260 
aaaggagcag aaactggagc agcagaggca agaacaggaa gtagagaggc atcgcagaga 1320 
acagcagctt cctcctctca gaggcaaaga tagaggacga gaaagggcag tggcaagtac 1380 
agaagtaaag cagaagcttc aagagttcct actgagtaaa tcagcaacga aagacactcc 1440 
aactaatgga aaaaatcatt ccgtgagccg ccatcccaag ctctggtaca cggctgccca 1500 
ccacacatca ttggatcaaa gctctccacc ccttagtgga acatctccat cctacaagta 1560 
cacattacca ggagcacaag atgcaaagga tgatttcccc cttcgaaaaa ctgcctctga 1620 
gcccaacttg aaggtgcggt ccaggttaaa acagaaagtg gcagagagga gaagcagccc 1680 
cttactcagg cggaaggatg gaaatgttgt cacttcattc aagaagcgaa tgtttgaggt 1740 
gacagaatcc tcagtcagta gcagttctcc aggctctggt cccagttcac caaacaatgg 1800 
gccaactgga agtgttactg aaaatgagac ttcggttttg ccccctaccc ctcatgccga 1860 
gcaaatggtt tcacagcaac gcattctaat tcatgaagat tccatgaacc tgctaagtct 1920 
ttatacctct ccttctttgc ccaacattac cttggggctt cccgcagtgc catcccagct 1980 
caatgcttcg aattcactca aagaaaagca gaagtgtgag acgcagacgc ttaggcaagg 2040 
tgttcctctg cctgggcagt atggaggcag catcccggca tcttccagcc accctcatgt 2100 
tactttagag ggaaagccac ccaacagcag ccaccaggct ctcctgcagc atttattatt 2160 
gaaagaacaa atgcgacagc aaaagcttct tgtagctggt ggagttccct tacatcctca 2220 
gtctcccttg gcaacaaaag agagaatttc acctggcatt agaggtaccc acaaattgcc 2280 
ccgtcacaga cccctgaacc gaacccagtc tgcacctttg cctcagagca cgttggctca 2340
gctggtcatt caacagcaac accagcaatt cttggagaag cagaagcaat accagcagca 2400 
gatccacatg aacaaactgc tttcgaaatc tattgaacaa ctgaagcaac caggcagtca 2460
ccttgaggaa gcagaggaag agcttcaggg ggaccaggcg atgcaggaag acagagcgcc 2520 
ctctagtggc aacagcacta ggagcgacag cagtgcttgt gtggatgaca cactgggaca 2580 
agttggggct gtgaaggtca aggaggaacc agtggacagt gatgaagatg ctcagatcca 2640 
ggaaatggaa tctggggagc aggctgcttt tatgcaacag cctttcctgg aacccacgca 2700 
cacacgtgcg ctctctgtgc gccaagctcc gctggctgcg gttggcatgg atggattaga 2760 
gaaacaccgt ctcgtctcca ggactcactc ttcccctgct gcctctgttt tacctcaccc 2820 
agcaatggac cgccccctcc agcctggctc tgcaactgga attgcctatg accccttgat 2880 
gctgaaacac cagtgcgttt gtggcaattc caccacccac cctgagcatg ctggacgaat 2940 
acagagtatc tggtcacgac tgcaagaaac tgggctgcta aataaatgtg agcgaattca 3000 
aggtcgaaaa gccagcctgg aggaaataca gcttgttcat tctgaacatc actcactgtt 3060 
gtatggcacc aaccccctgg acggacagaa gctggacccc aggatactcc taggtgatga 3120
ctctcaaaag tttttttcct cattaccttg tggtggactt ggggtggaca gtgacaccat 3180
ttggaatgag ctacactcgt ccggtgctgc acgcatggct gttggctgtg tcatcgagct 3240
ggcttccaaa gtggcctcag gagagctgaa gaatgggttt gctgttgtga ggccccctgg 3300
ccatcacgct gaagaatcca cagccatggg gttctgcttt tttaattcag ttgcaattac 3360
cgccaaatac ttgagagacc aactaaatat aagcaagata ttgattgtag atctggatgt 3420
tcaccatgga aacggtaccc agcaggcctt ttatgctgac cccagcatcc tgtacatttc 3480
actccatcgc tatgatgaag ggaacttttt ccctggcagt ggagccccaa atgaggttgg 3540
aacaggcctt ggagaagggt acaatataaa tattgcctgg acaggtggcc ttgatcctcc 3600
catgggagat gttgagtacc ttgaagcatt caggaccatc gtgaagcctg tggccaaaga 3660
gtttgatcca gacatggtct tagtatctgc tggatttgat gcattggaag gccacacccc 3720
tcctctagga gggtacaaag tgacggcaaa atgttttggt catttgacga agcaattgat 3780
gacattggct gatggacgtg tggtgttggc tctagaagga ggacatgatc tcacagccat 3840
ctgtgatgca tcagaagcct gtgtaaatgc ccttctagga aatgagctgg agccacttgc 3900
agaagatatt ctccaccaaa gcccgaatat gaatgctgtt atttctttac agaagatcat 3960
tgaaattcaa agtatgtctt taaagttctc tggatccggt accagattac aaggacgacg 4020
atgacaagta gatcccgggt ggcatccctg tgacccctcc ccagtgcctc tcctggcctt 4080
ggaagttgcc actccagtgc ccaccagcct tgtcctaata aaattaagtt gcatcatttt 4140
gtctgactag gtgtcctcta taatattatg gggtggaggg gggtggtatg gagcaagggg 4200
cccaagttgg gaagacaacc tgtagggcct gcggggtcta ttcgggaacc aagctggagt 4260
gcagtggcac aatcttggct cactgcaatc tccgcctcct gggttcaagc gattctcctg 4320
cctcagcctc ccgagttgtt gggattccag gcatgcatga ccaggctcag ctaatttttg 4380
tttttttggt agagacgggg tttcaccata ttggccaggc tggtctccaa ctcctaatct 4440
caggtgatct acccaccttg gcctcccaaa ttgctgggat tacaggcgtg aaccactgct 4500
cccttccctg tccttctgat tttaaaataa ctataccagc aggaggacgt ccagacacag 4560
cataggctac ctgccatggc ccaaccggtg ggacatttga gttgcttgct tggcactgtc 4620
ctctcatgcg ttgggtccac tcagtagatg cctgttgaat tgggtacgcg gccagcttct 4680
gtggaatgtg tgtcagttag ggtgtggaaa gtccccaggc tccccagcag gcagaagtat 4740
gcaaagcatg catctcaatt agtcagcaac caggtgtgga aaagtcccca ggctccccag 4800
caggcagaag tatgcaaagc atgcatctca attagtcagc aaccatagtc ccgcccctaa 4860
ctccgcccat cccgccccta actccgccca gttccgccca ttctccgccc catggctgac 4920
taattttttt tatttatgca gaggccgagg ccgcctcggc ctctgagcta ttccagaagt 4980
agtgaggagg cttttttgga ggcctaggct tttgcaaaaa gctcctcgag gaactgaaaa 5040
accagaaagt taattcccta tagtgagtcg tattaaattc gtaatcatgg tcatagctgt 5100
ttcctgtgtg aaattgttat ccgctcacaa ttccacacaa catacgagcc ggaagcataa 5160
agtgtaaagc ctggggtgcc taatgagtga gctaactcac attaattgcg ttgcgctcac 5220
tgcccgcttt ccagtcggga aacctgtcgt gccagctgca ttaatgaatc ggccaacgcg 5280
cggggagagg cggtttgcgt attgggcgct cttccgcttc ctcgctcact gactcgctgc 5340
gctcggtcgt tcggctgcgg cgagcggtat cagctcactc aaaggcggta atacggttat 5400
ccacagaatc aggggataac gcaggaaaga acatgtgagc aaaaggccag caaaaggcca 5460
ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag gctccgcccc cctgacgagc 5520
atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc gacaggacta taaagatacc 5580
aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg 5640
gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct ttctcaatgc tcacgctgta 5700
ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg 5760
ttcagcccga ccgctgcgcc ttatccggta actatcgtct tgagtccaac ccggtaagac 5820
acgacttatc gccactggca gcagccactg gtaacaggat tagcagagcg aggtatgtag 5880
gcggtgctac agagttcttg aagtggtggc ctaactacgg ctacactaga agaacagtat 5940
ttggtatctg cgctctgctg aagccagtta ccttcggaaa aagagttggt agctcttgat 6000
ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc 6060
gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc tacggggtct gacgctcagt 6120
ggaacgaaaa ctcacgttaa gggattttgg tcatgagatt atcaaaaagg atcttcacct 6180
agatcctttt aaattaaaaa tgaagtttta aatcaatcta aagtatatat gagtaaactt 6240
ggtctgacag ttaccaatgc ttaatcagtg aggcacctat ctcagcgatc tgtctatttc 6300
gttcatccat agttgcctga ctccccgtcg tgtagataac tacgatacgg gagggcttac 6360
catctggccc cagtgctgca atgataccgc gagacccacg ctcaccggct ccagatttat 6420
cagcaataaa ccagccagcc ggaagggccg agcgcagaag tggtcctgca actttatccg 6480
cctccatcca gtctattaat tgttgccggg aagctagagt aagtagttcg ccagttaata 6540
gtttgcgcaa cgttgttgcc attgctacag gcatcgtggt gtcacgctcg tcgtttggta 6600
tggcttcatt cagctccggt tcccaacgat caaggcgagt tacatgatcc cccatgttgt 6660
gcaaaaaagc ggttagctcc ttcggtcctc cgatcgttgt cagaagtaag ttggccgcag 6720
tgttatcact catggttatg gcagcactgc ataattctct tactgtcatg ccatccgtaa 6780
gatgcttttc tgtgactggt gagtactcaa ccaagtcatt ctgagaatag tgtatgcggc 6840
gaccgagttg ctcttgcccg gcgtcaatac gggataatac cgcgccacat agcagaactt 6900
taaaagtgct catcattgga aaacgttctt cggggcgaaa actctcaagg atcttaccgc 6960
tgttgagatc cagttcgatg taacccactc gtgcacccaa ctgatcttca gcatctttta 7020
ctttcaccag cgtttctggg tgagcaaaaa caggaaggca aaatgccgca aaaaagggaa 7080
taagggcgac acggaaatgt tgaatactca tactcttcct ttttcaatat tattgaagca 7140
tttatcaggg ttattgtctc atgagcggat acatatttga atgtatttag aaaaataaac 7200
aaataggggt tccgcgcaca tttccccgaa aagtgccacc tgacgcgccc tgtagcggcg 7260
cattaagcgc ggcgggtgtg gtggttacgc gcagcgtgac cgctacactt gccagcgccc 7320
tagcgcccgc tcctttcgct ttcttccctt cctttctcgc cacgttcgcc ggctttcccc 7380
gtcaagctct aaatcggggc atccctttag ggttccgatt tagtgcttta cggcacctcg 7440
accccaaaaa acttgattag ggtgatggtt cacgtagtgg gccatcgccc tgatagacgg 7500
tttttcgccc tttgacgttg gagtccacgt tctttaatag tggactcttg ttccaaactg 7560
gaacaacact caaccctatc tcggtctatt cttttgattt ataagggatt ttgccgattt 7620
cggcctattg gttaaaaaat gagctgattt aacaaaaatt taacgcgaat tttaacaaaa 7680
tattaaacgt ttacaattt 7699



<210> 15 

<211> 7303 

<212> DNA 

<213> Homo sapiens 

<400> 15 

cccattcgcc attcaggctg cgcaactgtt gggaagggcg atcggtgcgg gcctcttcgc 60
tattacgcca gctggcgaaa gggggatgtg ctgcaaggcg attaagttgg gtaacgccca 120
gggttttccc agtcacgacg ttgtaaaacg acggccagtg ccaagctgat ctaatcaata 180
ttggccatta gccatattat tcattggtta tatagcataa atcaatattg gctattggcc 240
attgcatacg ttgtatccat atcataatat gtacatttat attggctcat gtccaacatt 300
accgccatgt tgacattgat tattgactag ttattaatag taatcaatta cggggtcatt 360
agttcatagc ccatatatgg agttccgcgt tacataactt acggtaaatg gcccgcctgg 420
cgaccgccca gcgacccccg cccgttgacg tcaatagtga cgtatgttcc catagtaacg 480
ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac tgcccacttg 540
gcagtacatc aagtgtatca tatgccaagt ccgcccccta ttgacgtcaa tgacggtaaa 600
tggcccgcct agcattatgc ccagtacatg accttacggg agtttcctac ttggcagtac 660
atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta caccaatggg 720
cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga cgtcaatggg 780
agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaataa ccccgccccg 840
ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag agctcgttta 900
gtgaaccgtc agaattcaag cttgcggccg cagatctatc gatctgcagg atatcaccat 960
gcacagtatg atcagctcag tggatgtgaa gtcagaagtt cctgtgggcc tggagcccat 1020
ctcaccttta gacctaagga cagacctcag gatgatgatg cccgtggtgg accctgttgt 1080
ccgtgagaag caattgcagc aggaattact tcttatccag cagcagcaac aaatccagaa 1140
gcagcttctg atagcagagt ttcagaaaca gcatgagaac ttgacacggc agcaccaggc 1200
tcagcttcag gagcatatca aggaacttct agccataaaa cagcaacaag aactcctaga 1260
aaaggagcag aaactggagc agcagaggca agaacaggaa gtagagaggc atcgcagaga 1320
acagcagctt cctcctctca gaggcaaaga tagaggacga gaaagggcag tggcaagtac 1380
agaagtaaag cagaagcttc aagagttcct actgagtaaa tcagcaacga aagacactcc 1440
aactaatgga aaaaatcatt ccgtgagccg ccatcccaag ctctggtaca cggctgccca 1500
ccacacatca ttggatcaaa gctctccacc ccttagtgga acatctccat cctacaagta 1560
cacattacca ggagcacaag atgcaaagga tgatttcccc cttcgaaaaa ctgcctctga 1620
gcccaacttg aaggtgcggt ccaggttaaa acagaaagtg gcagagagga gaagcagccc 1680
cttactcagg cggaaggatg gaaatgttgt cacttcattc aagaagcgaa tgtttgaggt 1740
gacagaatcc tcagtcagta gcagttctcc aggctctggt cccagttcac caaacaatgg 1800
gccaactgga agtgttactg aaaatgagac ttcggttttg ccccctaccc ctcatgccga 1860
gcaaatggtt tcacagcaac gcattctaat tcatgaagat tccatgaacc tgctaagtct 1920
ttatacctct ccttctttgc ccaacattac cttggggctt cccgcagtgc catcccagct 1980
caatgcttcg aattcactca aagaaaagca gaagtgtgag acgcagacgc ttaggcaagg 2040
tgttcctctg cctgggcagt atggaggcag catcccggca tcttccagcc accctcatgt 2100
tactttagag ggaaagccac ccaacagcag ccaccaggct ctcctgcagc atttattatt 2160
gaaagaacaa atgcgacagc aaaagcttct tgtagctggt ggagttccct tacatcctca 2220
gtctcccttg gcaacaaaag agagaatttc acctggcatt agaggtaccc acaaattgcc 2280
ccgtcacaga cccctgaacc gaacccagtc tgcacctttg cctcagagca cgttggctca 2340
gctggtcatt caacagcaac accagcaatt cttggagaag cagaagcaat accagcagca 2400
gatccacatg aacaaactgc tttcgaaatc tattgaacaa ctgaagcaac caggcagtca 2460
ccttgaggaa gcagaggaag agcttcaggg ggaccaggcg atgcaggaag acagagcgcc 2520
ctctagtggc aacagcacta ggagcgacag cagtgcttgt gtggatgaca cactgggaca 2580
agttggggct gtgaaggtca aggaggaacc agtggacagt gatgaagatg ctcagatcca 2640
ggaaatggaa tctggggagc aggctgcttt tatgcaacag cctttcctgg aacccacgca 2700
cacacgtgcg ctctctgtgc gccaagctcc gctggctgcg gttggcatgg atggattaga 2760
gaaacaccgt ctcgtctcca ggactcactc ttcccctgct gcctctgttt tacctcaccc 2820
agcaatggac cgccccctcc agcctggctc tgcaactgga attgcctatg accccttgat 2880
gctgaaacac cagtgcgttt gtggcaattc caccacccac cctgagcatg ctggacgaat 2940
acagagtatc tggtcacgac tgcaagaaac tgggctgcta aataaatgtg agcgaattca 3000
aggtcgaaaa gccagcctgg aggaaataca gcttgttcat tctgaacatc actcactgtt 3060
gtatggcacc aaccccctgg acggacagaa gctggacccc aggatactcc taggtgatga 3120
ctctcaaaag tttttttcct cattaccttg tggtggactt ggggtggaca gtgacaccat 3180
ttggaatgag ctacactcgt ccggtgctgc acgcatggct gttggctgtg tcatcgagct 3240
ggcttccaaa gtggcctcag gagagctgaa gaatgggttt gctgttgtga ggccccctgg 3300
ccatcacgct gaagaatcca cagccatggg gttctgcttt tttaattcag ttgcaattac 3360
cgccaaatac ttgagagacc aactaaatat aagcaagata ttgattgtag atctggatgt 3420
tcaccatgga aacggtaccc agcaggcctt ttatgctgac cccagcatcc tgtacatttc 3480
actccatcgc tatgatgaag ggaacttttt ccctggcagt ggagccccaa atgaggttcg 3540
gtttatttct ttagagcccc acttttattt gtatctttca ggtaattgca ttgcaggatc 3600
cggtaccaga ttacaaggac gacgatgaca agtagatccc gggtggcatc cctgtgaccc 3660
ctccccagtg cctctcctgg ccttggaagt tgccactcca gtgcccacca gccttgtcct 3720
aataaaatta agttgcatca ttttgtctga ctaggtgtcc tctataatat tatggggtgg 3780
aggggggtgg tatggagcaa ggggcccaag ttgggaagac aacctgtagg gcctgcgggg 3840
tctattcggg aaccaagctg gagtgcagtg gcacaatctt ggctcactgc aatctccgcc 3900
tcctgggttc aagcgattct cctgcctcag cctcccgagt tgttgggatt ccaggcatgc 3960
atgaccaggc tcagctaatt tttgtttttt tggtagagac ggggtttcac catattggcc 4020
aggctggtct ccaactccta atctcaggtg atctacccac cttggcctcc caaattgctg 4080
ggattacagg cgtgaaccac tgctcccttc cctgtccttc tgattttaaa ataactatac 4140
cagcaggagg acgtccagac acagcatagg ctacctgcca tggcccaacc ggtgggacat 4200
ttgagttgct tgcttggcac tgtcctctca tgcgttgggt ccactcagta gatgcctgtt 4260
gaattgggta cgcggccagc ttctgtggaa tgtgtgtcag ttagggtgtg gaaagtcccc 4320
aggctcccca gcaggcagaa gtatgcaaag catgcatctc aattagtcag caaccaggtg 4380
tggaaaagtc cccaggctcc ccagcaggca gaagtatgca aagcatgcat ctcaattagt 4440
cagcaaccat agtcccgccc ctaactccgc ccatcccgcc cctaactccg cccagttccg 4500
cccattctcc gccccatggc tgactaattt tttttattta tgcagaggcc gaggccgcct 4560
cggcctctga gctattccag aagtagtgag gaggcttttt tggaggccta ggcttttgca 4620
aaaagctcct cgaggaactg aaaaaccaga aagttaattc cctatagtga gtcgtattaa 4680
attcgtaatc atggtcatag ctgtttcctg tgtgaaattg ttatccgctc acaattccac 4740
acaacatacg agccggaagc ataaagtgta aagcctgggg tgcctaatga gtgagctaac 4800
tcacattaat tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg tcgtgccagc 4860
tgcattaatg aatcggccaa cgcgcgggga gaggcggttt gcgtattggg cgctcttccg 4920
cttcctcgct cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg gtatcagctc 4980
actcaaaggc ggtaatacgg ttatccacag aatcagggga taacgcagga aagaacatgt 5040
gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg gcgtttttcc 5100
ataggctccg cccccctgac gagcatcaca aaaatcgacg ctcaagtcag aggtggcgaa 5160
acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc gtgcgctctc 5220
ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg ggaagcgtgg 5280
cgctttctca atgctcacgc tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc 5340
tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc ggtaactatc 5400
gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc actggtaaca 5460
ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg tggcctaact 5520
acggctacac tagaagaaca gtatttggta tctgcgctct gctgaagcca gttaccttcg 5580
gaaaaagagt tggtagctct tgatccggca aacaaaccac cgctggtagc ggtggttttt 5640
ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat cctttgatct 5700
tttctacggg gtctgacgct cagtggaacg aaaactcacg ttaagggatt ttggtcatga 5760
gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt tttaaatcaa 5820
tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc agtgaggcac 5880
ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc gtcgtgtaga 5940
taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata ccgcgagacc 6000






cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg gccgagcgca 6060 
gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc cgggaagcta 6120 
gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct acaggcatcg 6180 
tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa cgatcaaggc 6240 
gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt cctccgatcg 6300 
ttgtcagaag taagttggcc gcagtgttat cactcatggt tatggcagca ctgcataatt 6360 
ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac tcaaccaagt 6420 
cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca atacgggata 6480 
ataccgcgcc acatagcaga actttaaaag tgctcatcat tggaaaacgt tcttcggggc 6540 
gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc actcgtgcac 6600 
ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca aaaacaggaa 6660 
ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata ctcatactct 6720 
tcctttttca atattattga agcatttatc agggttattg tctcatgagc ggatacatat 6780 
ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg cacatttccc cgaaaagtgc 6840 
cacctgacgc gccctgtagc ggcgcattaa gcgcggcggg tgtggtggtt acgcgcagcg 6900 
tgaccgctac acttgccagc gccctagcgc ccgctccttt cgctttcttc ccttcctttc 6960 
tcgccacgtt cgccggcttt ccccgtcaag ctctaaatcg gggcatccct ttagggttcc 7020 
gatttagtgc tttacggcac ctcgacccca aaaaacttga ttagggtgat ggttcacgta 7080 
gtgggccatc gccctgatag acggtttttc gccctttgac gttggagtcc acgttcttta 7140 
atagtggact cttgttccaa actggaacaa cactcaaccc tatctcggtc tattcttttg 7200 
atttataagg gattttgccg atttcggcct attggttaaa aaatgagctg atttaacaaa 7260 
aatttaacgc gaattttaac aaaatattaa acgtttacaa ttt 7303

<210> 16
<211> 24
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer used to amplify human DNA
<400> 16

ccatggaaac ggtacccagc aggc 24

<210> 17
<211> 23
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer used to amplify human DNA
<400> 17

cactccatcg ctatgatgaa ggg 23

<210> 18
<211> 23
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer used to amplify human DNA
<400> 18

agttcccttc atcatagcga tgg
<220>

<223> Primer used to amplify human DNA 
<400> 19 

aatgtacagg atgctggggt 

<210> 20 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer used to amplify human DNA 
<400> 20 

cccttgtagc tggtggagtt ccctt 

<210> 21 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer used to amplify human DNA 
<400> 21 

tgtgtcatcg agctggcttc 

<210> 22 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer used to amplify human DNA 



<400> 22 

atcttctgca agtggctcca 



20